Compare commits

...

91 Commits

Author SHA1 Message Date
Maverick Studer
591809336f Merge branch 'RED-9865-bp' into 'release/4.244.x'
RED-9865: fix for case 2

See merge request redactmanager/redaction-service!494
2024-08-23 17:01:17 +02:00
Maverick Studer
61be29a327 RED-9865: fix for case 2 2024-08-23 17:01:16 +02:00
Dominique Eifländer
debc014cf0 Merge branch 'RED-9274-4.0' into 'release/4.244.x'
RED-9274: Set state REMOVED if last change is REMOVED in migration

See merge request redactmanager/redaction-service!433
2024-06-17 13:15:40 +02:00
Dominique Eifländer
a8ad213cad RED-9274: Set state REMOVED if last change is REMOVED in migration 2024-06-17 13:04:03 +02:00
Dominique Eifländer
436d0db23c Merge branch 'hotFixeExceptions-4.0' into 'release/4.244.x'
hotfix: show stacktrace when exceptions occurs

See merge request redactmanager/redaction-service!428
2024-06-14 15:37:46 +02:00
Dominique Eifländer
1a5a2dbe10 hotfix: show stacktrace when exceptions occurs 2024-06-14 15:25:27 +02:00
Corina Olariu
f05f1fd9b3 Merge branch 'RED-9132-bp' into 'release/4.244.x'
RED-9132 - Remove debug information from Paragraph/Location field

See merge request redactmanager/redaction-service!402
2024-05-17 14:18:44 +02:00
Corina Olariu
91385de867 RED-9132 - Remove debug information from Paragraph/Location field 2024-05-17 14:18:40 +02:00
Maverick Studer
02fda23461 Merge branch 'RED-9091' into 'release/4.244.x'
RED-9091: Cannot re-add manual redaction on same position with same reason

See merge request redactmanager/redaction-service!388
2024-05-03 10:03:07 +02:00
Maverick Studer
989eb2145e RED-9091: Cannot re-add manual redaction on same position with same reason 2024-05-03 10:03:07 +02:00
Kilian Schüttler
4b2db3126d Merge branch 'RED-9042' into 'release/4.244.x'
RED-9042: fix recategorization merge with legal-basis

See merge request redactmanager/redaction-service!386
2024-04-25 17:52:01 +02:00
Kilian Schuettler
f1deb374ba RED-9042: fix recategorization merge with legal-basis 2024-04-25 17:22:22 +02:00
Dominique Eifländer
3bd1fbebae Merge branch 'RED-7384' into 'release/4.244.x'
Resolve RED-7384

See merge request redactmanager/redaction-service!384
2024-04-24 11:50:13 +02:00
Kilian Schüttler
8961abc4fe Resolve RED-7384 2024-04-24 11:50:13 +02:00
Kilian Schüttler
b01288d85e Merge branch 'RED-7384' into 'release/4.244.x'
RED-7384: improve performance significantly

See merge request redactmanager/redaction-service!381
2024-04-23 12:51:07 +02:00
Kilian Schüttler
e6c048d6df RED-7384: improve performance significantly 2024-04-23 12:51:07 +02:00
Kilian Schüttler
93c1a2b90a Merge branch 'RED-7384' into 'release/4.244.x'
RED-7384: fix imported stuff

See merge request redactmanager/redaction-service!367
2024-04-11 14:08:45 +02:00
Kilian Schüttler
07d248c157 RED-7384: fix imported stuff 2024-04-11 14:08:45 +02:00
Kilian Schüttler
fa870fa856 Merge branch 'RED-8690-bp' into 'release/4.244.x'
RED-8690: Overlapping SKIPPED and APPLIED of same type

See merge request redactmanager/redaction-service!365
2024-04-10 14:52:26 +02:00
Kilian Schüttler
a2e72f8a62 RED-8690: Overlapping SKIPPED and APPLIED of same type 2024-04-10 14:52:26 +02:00
Kilian Schüttler
7a9b1c65d7 Merge branch 'RED-8905-bp' into 'release/4.244.x'
RED-8905: DM: File in error state when using getPreviousSibling()

See merge request redactmanager/redaction-service!363
2024-04-04 16:11:14 +02:00
Kilian Schuettler
960daf6840 RED-8905: DM: File in error state when using getPreviousSibling() 2024-04-04 16:02:09 +02:00
Andrei Isvoran
a1ffbab5a6 Merge branch 'RED-8877-bp' into 'release/4.244.x'
RED-8877 - Remove X.1.0 rule

See merge request redactmanager/redaction-service!360
2024-04-03 15:38:31 +02:00
Andrei Isvoran
a2a3e84031 RED-8877 - Remove X.1.0 rule 2024-04-03 15:38:30 +02:00
Kilian Schüttler
22ec044503 Merge branch 'RED-8773-bp' into 'release/4.244.x'
RED-8773 - Wrong value for recategorized and forced logo - backport

See merge request redactmanager/redaction-service!331
2024-04-03 15:18:49 +02:00
Kilian Schüttler
72f324bb11 Merge branch 'RED-7384' into 'release/4.244.x'
RED-7384: handle pending dict application in redaction-service instead of persistence

See merge request redactmanager/redaction-service!356
2024-04-03 15:17:03 +02:00
Kilian Schüttler
5031b17a7d RED-7384: handle pending dict application in redaction-service instead of persistence 2024-04-03 15:17:03 +02:00
Andrei Isvoran
8187983944 Merge branch 'RED-8776-bp' into 'release/4.244.x'
RED-8776 - Add local redaction when we do a manual change on a non-manual redaction

See merge request redactmanager/redaction-service!358
2024-04-03 14:55:45 +02:00
Andrei Isvoran
99951dd1bb RED-8776 - Add local redaction when we do a manual change on a non-manual redaction 2024-04-03 14:02:25 +03:00
Kilian Schüttler
67184eb861 Merge branch 'RED-7384' into 'release/4.244.x'
RED-7384: also migrate annotation Ids with unprocessed manual changes

See merge request redactmanager/redaction-service!355
2024-04-02 15:01:41 +02:00
Kilian Schuettler
a72d96c646 RED-7384: also migrate annotation Ids with unprocessed manual changes 2024-04-02 14:53:24 +02:00
Andrei Isvoran
e5641d7097 Merge branch 'RED-8828-fix-npe-bp' into 'release/4.244.x'
RED-8828 - Fix error when resizing dict based redaction

See merge request redactmanager/redaction-service!354
2024-04-02 08:33:37 +02:00
Andrei Isvoran
62c0a7eb09 RED-8828 - Fix error when resizing dict based redaction 2024-03-29 14:32:45 +02:00
Kilian Schüttler
7dbc585274 Merge branch 'RED-7384' into 'release/4.244.x'
RED-7384: migration fixes, ignore non-local manual changes in migration

See merge request redactmanager/redaction-service!344
2024-03-27 18:09:10 +01:00
Kilian Schüttler
981901012c RED-7384: migration fixes, ignore non-local manual changes in migration 2024-03-27 18:09:10 +01:00
Andrei Isvoran
9b7e5bcca1 Merge branch 'RED-8840-bp' into 'release/4.244.x'
RED-8840 - Adjust rules

See merge request redactmanager/redaction-service!350
2024-03-27 15:33:45 +01:00
Andrei Isvoran
8e8864fdfa RED-8840 - Adjust rules 2024-03-27 16:15:48 +02:00
Ali Oezyetimoglu
e3031aa716 Merge branch 'RED-8480-bp2' into 'release/4.244.x'
RED-8480: addded property "value" to places with recategorizations

See merge request redactmanager/redaction-service!347
2024-03-27 11:23:23 +01:00
Ali Oezyetimoglu
793bbc6bde RED-8480: addded property "value" to places with recategorizations 2024-03-27 11:11:48 +01:00
Dominique Eifländer
895d56c05f Merge branch 'RED-8834-4.0' into 'release/4.244.x'
RED-8834: Fixed text entities with empty text range

See merge request redactmanager/redaction-service!346
2024-03-27 10:02:00 +01:00
Dominique Eifländer
bbac9866d9 RED-8834: Fixed text entities with empty text range 2024-03-27 09:47:00 +01:00
Kilian Schüttler
0e4e615d4a Merge branch 'RED-8854' into 'release/4.244.x'
RED-8854: Recategorization from formula/image/logo to signature is not displayed in report

See merge request redactmanager/redaction-service!342
2024-03-26 12:40:16 +01:00
Kilian Schüttler
c5b83bf6e0 RED-8854: Recategorization from formula/image/logo to signature is not displayed in report 2024-03-26 12:40:16 +01:00
Kilian Schüttler
455137131c Merge branch 'RED-7384' into 'release/4.244.x'
RED-7384: migration fixes

See merge request redactmanager/redaction-service!339
2024-03-26 10:00:07 +01:00
Kilian Schüttler
097bc34568 RED-7384: migration fixes 2024-03-26 10:00:07 +01:00
Andrei Isvoran
3deb002eb7 Merge branch 'RED-8840-bp' into 'release/4.244.x'
RED-8840 - Add PII.4 rules for sanitisation with correct legal basis

See merge request redactmanager/redaction-service!336
2024-03-25 16:36:43 +01:00
Andrei Isvoran
d8684595a3 RED-8840 - Add PII.4 rules for sanitisation with correct legal basis 2024-03-25 16:36:43 +01:00
Dominique Eifländer
3bfd41bd5e Merge branch 'RED-7384' into 'release/4.244.x'
RED-7384: fixes for migration

See merge request redactmanager/redaction-service!334
2024-03-21 14:27:32 +01:00
Kilian Schüttler
b9ce6f8c85 RED-7384: fixes for migration 2024-03-21 14:27:31 +01:00
Ali Oezyetimoglu
5647bc16f5 Merge branch 'RED-8480-bp' into 'release/4.244.x'
RED-8480: updated code according to changes from ManualRecategorization

See merge request redactmanager/redaction-service!333
2024-03-21 11:43:46 +01:00
Ali Oezyetimoglu
b55f33e39a RED-8480: updated code according to changes from ManualRecategorization 2024-03-21 09:01:57 +01:00
Andrei Isvoran
82d4fd567b Merge branch 'RED-8784-bp' into 'release/4.244.x'
RED-8784 - Change PII.9.1/PII.9.2 to CBI.23.0/CBI.23.1

See merge request redactmanager/redaction-service!328
2024-03-19 12:32:23 +01:00
Andrei Isvoran
7868b56993 RED-8784 - Change PII.9.1/PII.9.2 to CBI.23.0/CBI.23.1 2024-03-19 12:32:23 +01:00
Corina Olariu
afc83db249 RED-8773 - Wrong value for recategorized and forced logo - backport
- use image.getValue instead of image.value() so that following recategorizations for images will get the updated value
2024-03-18 18:19:38 +02:00
Kilian Schüttler
0d53e65a0c Merge branch 'RED-7384-bp' into 'release/4.244.x'
RED-7384-bp: add useful fields to ManualRedactionEntry

See merge request redactmanager/redaction-service!313
2024-03-18 13:17:11 +01:00
Kilian Schüttler
f641824270 RED-7384-bp: add useful fields to ManualRedactionEntry 2024-03-18 13:17:11 +01:00
Dominique Eifländer
07d6fea992 RED-7384: Ignore redactionLog entries on non existing pages for migration 2024-03-18 11:04:33 +01:00
Andrei Isvoran
88888545e5 Merge branch 'RED-8680' into 'release/4.244.x'
RED-8680 - Add specific CBI rules for seeds

See merge request redactmanager/redaction-service!323
2024-03-18 10:58:42 +01:00
Andrei Isvoran
7777e74a44 RED-8680 - Add specific CBI rules for seeds 2024-03-15 16:01:31 +02:00
Andrei Isvoran
187f7c95e0 Merge branch 'RED-8705-bp' into 'release/4.244.x'
RED-8705 - Fix image type

See merge request redactmanager/redaction-service!322
2024-03-15 09:07:02 +01:00
Andrei Isvoran
69d0ab0754 RED-8705 - Fix image type 2024-03-14 17:14:31 +02:00
Andrei Isvoran
9101b53970 Merge branch 'RED-8680-seeds-bp' into 'release/4.244.x'
RED-8680 - Add rules for syngenta sanitisation seeds

See merge request redactmanager/redaction-service!319
2024-03-14 12:36:39 +01:00
Andrei Isvoran
3579c06033 RED-8680 - Add rules for syngenta sanitisation seeds 2024-03-13 16:59:14 +02:00
Andrei Isvoran
dbb89321c4 Merge branch 'RED-8645-more-fixes' into 'release/4.244.x'
RED-8645 - Fix some more rules

See merge request redactmanager/redaction-service!317
2024-03-13 08:47:52 +01:00
Andrei Isvoran
1c3c632fd2 RED-8645 - Fix some more rules 2024-03-13 08:47:52 +01:00
Dominique Eifländer
76f587aae4 RED-7384: Fixed missing requestDate in new created manualRedactions for resizes 2024-03-12 12:00:31 +01:00
Dominique Eifländer
dc7910cd07 RED-7384: Fixed migration problem for a specific file 2024-03-12 09:34:39 +01:00
Andrei Isvoran
d877b362ef Merge branch 'RED-8645-rules-bp' into 'release/4.244.x'
RED-8645 - Update RM rules

See merge request redactmanager/redaction-service!310
2024-03-08 14:34:04 +01:00
Andrei Isvoran
364f994ffd RED-8645 - Update RM rules 2024-03-08 14:34:03 +01:00
Kilian Schüttler
4291fe56f6 Merge branch 'image-name-backport' into 'release/4.244.x'
Image name backport

See merge request redactmanager/redaction-service!307
2024-03-05 16:58:13 +01:00
Kilian Schüttler
ec04a902d5 Image name backport 2024-03-05 16:58:12 +01:00
Corina Olariu
a5969dbf8c Merge branch 'RED-8590-backport' into 'release/4.244.x'
RED-8590 - Missing reason for added CBI Address, recategorized LOGO and signature

See merge request redactmanager/redaction-service!300
2024-03-01 13:29:48 +01:00
Corina Olariu
0b833f2f22 RED-8590 - Missing reason for added CBI Address, recategorized LOGO and signature 2024-03-01 13:29:48 +01:00
Andrei Isvoran
bee535715d Merge branch 'RED-8586-bp-dossier-redactions' into 'release/4.244.x'
RED-8586 - Don't treat dossier redactions differently

See merge request redactmanager/redaction-service!301
2024-03-01 12:38:44 +01:00
Andrei Isvoran
8362296edd RED-8586 - Don't treat dossier redactions differently 2024-03-01 12:38:43 +01:00
Andrei Isvoran
ec6cf3efb0 Merge branch 'RED-8586-fix-bp' into 'release/4.244.x'
RED-8586 - Add higher salience to rule ETC.5.1

See merge request redactmanager/redaction-service!298
2024-02-29 15:42:16 +01:00
Andrei Isvoran
cde17075e5 RED-8586 - Add higher salience to rule ETC.5.1 2024-02-29 16:26:51 +02:00
Maverick Studer
27ae7d6d96 Merge branch 'RED-8550-bp' into 'release/4.244.x'
RED-8550: Faulty table recognition and text duplication leads to huge sections

See merge request redactmanager/redaction-service!297
2024-02-29 15:01:53 +01:00
Maverick Studer
62ccf46069 RED-8550: Faulty table recognition and text duplication leads to huge sections 2024-02-29 15:01:53 +01:00
Andrei Isvoran
2bc3440a14 Merge branch 'RED-8586-backport' into 'release/4.244.x'
RED-8586 - Fix confidentiality rules

See merge request redactmanager/redaction-service!295
2024-02-29 12:58:11 +01:00
Andrei Isvoran
8820eb696c RED-8586 - Fix confidentiality rules 2024-02-28 17:22:45 +02:00
Kilian Schüttler
172c9dc3ee Merge branch 'RED-8615-bp' into 'release/4.244.x'
RED-8615: backport

See merge request redactmanager/redaction-service!292
2024-02-28 13:12:54 +01:00
Kilian Schüttler
d488bee1bb RED-8615: backport 2024-02-28 13:12:54 +01:00
Kilian Schüttler
b99d9ed59c Merge branch 'RED-7384-bp' into 'release/4.244.x'
RED-7384: fixes for migration backport

See merge request redactmanager/redaction-service!288
2024-02-27 10:04:38 +01:00
Kilian Schüttler
8175f6d012 RED-7384: fixes for migration backport 2024-02-27 10:04:38 +01:00
Kilian Schüttler
8f6b242c40 Merge branch 'hotfix-bp' into 'release/4.244.x'
Hotfix bp

See merge request redactmanager/redaction-service!286
2024-02-26 17:55:40 +01:00
Kilian Schüttler
12a22bbb47 Hotfix bp 2024-02-26 17:55:40 +01:00
Corina Olariu
e620412f09 Merge branch 'RED-8589-4.0.0' into 'release/4.244.x'
RED-8589 - Add "MANUAL" engine to all annotations that has entries in...

See merge request redactmanager/redaction-service!287
2024-02-23 13:22:53 +01:00
Corina Olariu
ac38a966c5 RED-8589 - Add "MANUAL" engine to all annotations that has entries in... 2024-02-23 13:22:53 +01:00
Maverick Studer
89227ec850 Merge branch 'RED-8607-bp' into 'release/4.244.x'
RED-8607: Higher rank hint removed if overlaps lower rank redaction

See merge request redactmanager/redaction-service!284
2024-02-22 16:15:33 +01:00
Maverick Studer
80fdff803d RED-8607: Higher rank hint removed if overlaps lower rank redaction 2024-02-22 16:15:32 +01:00
63 changed files with 5810 additions and 1772 deletions

View File

@@ -4,10 +4,11 @@ plugins {
 }
 
 description = "redaction-service-api-v1"
 
+val persistenceServiceVersion = "2.349.79"
 dependencies {
 	implementation("org.springframework:spring-web:6.0.12")
-	implementation("com.iqser.red.service:persistence-service-internal-api-v1:2.338.0")
+	implementation("com.iqser.red.service:persistence-service-internal-api-v1:${persistenceServiceVersion}")
 }
 
 publishing {

View File

@@ -1,12 +1,15 @@
 package com.iqser.red.service.redaction.v1.model;
 
+import java.util.Collections;
+import java.util.Set;
+
 import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.ManualRedactions;
-import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.RedactionLog;
 import lombok.AllArgsConstructor;
 import lombok.Builder;
 import lombok.Data;
 import lombok.NoArgsConstructor;
+import lombok.NonNull;
 
 @Data
 @Builder
@@ -14,9 +17,18 @@ import lombok.NoArgsConstructor;
 @AllArgsConstructor
 public class MigrationRequest {
 
+	@NonNull
 	String dossierTemplateId;
+	@NonNull
 	String dossierId;
+	@NonNull
 	String fileId;
+	boolean fileIsApproved;
+	@NonNull
 	ManualRedactions manualRedactions;
+	@NonNull
+	@Builder.Default
+	Set<String> entitiesWithComments = Collections.emptySet();
 }
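The `@Builder.Default` annotation on `entitiesWithComments` matters here: without it, Lombok's generated builder would leave the field `null` whenever a caller does not set it, ignoring the `Collections.emptySet()` initializer. A hand-rolled sketch of roughly what Lombok generates for this combination (plain Java, no Lombok, class and field names hypothetical):

```java
import java.util.Collections;
import java.util.Objects;
import java.util.Set;

// Hand-rolled approximation of the builder Lombok generates for
// @Builder + @NonNull + @Builder.Default on MigrationRequest (sketch only).
final class MigrationRequestSketch {

    final String fileId;                     // @NonNull
    final Set<String> entitiesWithComments;  // @Builder.Default

    private MigrationRequestSketch(String fileId, Set<String> entitiesWithComments) {
        // @NonNull fields get a null check in the generated constructor
        this.fileId = Objects.requireNonNull(fileId, "fileId");
        this.entitiesWithComments = entitiesWithComments;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        private String fileId;
        private Set<String> entitiesWithComments;
        private boolean entitiesWithCommentsSet; // was a value explicitly provided?

        Builder fileId(String fileId) {
            this.fileId = fileId;
            return this;
        }

        Builder entitiesWithComments(Set<String> value) {
            this.entitiesWithComments = value;
            this.entitiesWithCommentsSet = true;
            return this;
        }

        MigrationRequestSketch build() {
            // @Builder.Default: fall back to the field initializer when unset
            Set<String> entities = entitiesWithCommentsSet ? entitiesWithComments : Collections.emptySet();
            return new MigrationRequestSketch(fileId, entities);
        }
    }
}
```

So `builder().fileId("f1").build()` yields an empty set rather than `null`, which is exactly the kind of NPE the added `@NonNull` annotations guard against elsewhere in the request.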

View File

@@ -12,11 +12,11 @@ plugins {
 
 description = "redaction-service-server-v1"
 
-val layoutParserVersion = "0.86.0"
+val layoutParserVersion = "0.89.11"
 val jacksonVersion = "2.15.2"
 val droolsVersion = "9.44.0.Final"
 val pdfBoxVersion = "3.0.0"
-val persistenceServiceVersion = "2.338.0"
+val persistenceServiceVersion = "2.349.79"
 val springBootStarterVersion = "3.1.5"
 
 configurations {
@@ -65,6 +65,7 @@ dependencies {
 	testImplementation("org.apache.pdfbox:pdfbox-tools:${pdfBoxVersion}")
 	testImplementation("org.springframework.boot:spring-boot-starter-test:${springBootStarterVersion}")
+	testImplementation("com.knecon.fforesight:viewer-doc-processor:${layoutParserVersion}")
 	testImplementation("com.knecon.fforesight:layoutparser-service-processor:${layoutParserVersion}") {
 		exclude(
 			group = "com.iqser.red.service",

View File

@@ -30,4 +30,10 @@ public class RedactionServiceSettings {
 
 	private int droolsExecutionTimeoutSecs = 300;
 
+	public int getDroolsExecutionTimeoutSecs(int numberOfPages) {
+		return (int) Math.max(getDroolsExecutionTimeoutSecs(), getDroolsExecutionTimeoutSecs() * ((float) numberOfPages / 1000));
+	}
+
 }
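The new overload scales the Drools execution timeout linearly with page count while never dropping below the configured base value. A standalone restatement of the same arithmetic (the base of 300 s is taken from the field initializer above; class and method names are illustrative):

```java
// Standalone sketch of the page-scaled timeout from RedactionServiceSettings.
// baseSecs corresponds to the configured droolsExecutionTimeoutSecs (default 300).
final class DroolsTimeoutSketch {

    static int timeoutSecs(int baseSecs, int numberOfPages) {
        // Below 1000 pages the scaling factor is < 1, so Math.max keeps the
        // base value; above 1000 pages the timeout grows linearly with pages.
        return (int) Math.max(baseSecs, baseSecs * ((float) numberOfPages / 1000));
    }
}
```

With the default base, a 500-page file keeps the 300 s floor, while a 2000-page file gets 600 s.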

View File

@@ -18,6 +18,7 @@ import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations
 import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRecategorization;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRedactionEntry;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualResizeRedaction;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.dossiertemplate.type.DictionaryEntryType;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.Change;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ChangeType;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.Engine;
@@ -45,7 +46,7 @@ public class LegacyRedactionLogMergeService {
 	public RedactionLog addManualAddEntriesAndRemoveSkippedImported(RedactionLog redactionLog, ManualRedactions manualRedactions, String dossierTemplateId) {
 		Set<String> skippedImportedRedactions = new HashSet<>();
 
-		log.info("Merging Redaction log with manual redactions");
+		log.info("Adding manual add Entries and removing skipped or imported entries");
 
 		if (manualRedactions != null) {
 			var manualRedactionLogEntries = addManualAddEntries(manualRedactions.getEntriesToAdd(), redactionLog.getAnalysisNumber());
@@ -93,6 +94,15 @@ public class LegacyRedactionLogMergeService {
 	}
 
+	public long getNumberOfAffectedAnnotations(ManualRedactions manualRedactions) {
+		return createManualRedactionWrappers(manualRedactions).stream()
+				.map(ManualRedactionWrapper::getId)
+				.distinct()
+				.count();
+	}
+
 	private List<ManualRedactionWrapper> createManualRedactionWrappers(ManualRedactions manualRedactions) {
 		List<ManualRedactionWrapper> manualRedactionWrappers = new ArrayList<>();
@@ -193,7 +203,12 @@ public class LegacyRedactionLogMergeService {
 		}
 		redactionLogEntry.getManualChanges()
-				.add(ManualChange.from(imageRecategorization).withManualRedactionType(ManualRedactionType.RECATEGORIZE).withChange("type", imageRecategorization.getType()));
+				.add(ManualChange.from(imageRecategorization)
+						.withManualRedactionType(ManualRedactionType.RECATEGORIZE)
+						.withChange("type", imageRecategorization.getType())
+						.withChange("section", imageRecategorization.getSection())
+						.withChange("legalBasis", imageRecategorization.getLegalBasis())
+						.withChange("value", imageRecategorization.getValue()));
 	}
@@ -365,13 +380,17 @@ public class LegacyRedactionLogMergeService {
 	@SuppressWarnings("PMD.UselessParentheses")
 	private boolean shouldCreateManualEntry(ManualRedactionEntry manualRedactionEntry) {
-		if (!manualRedactionEntry.isApproved()) {
+		if (manualRedactionEntry.getDictionaryEntryType() != null //
+				&& (manualRedactionEntry.getDictionaryEntryType().equals(DictionaryEntryType.FALSE_POSITIVE) //
+						|| manualRedactionEntry.getDictionaryEntryType().equals(DictionaryEntryType.FALSE_RECOMMENDATION))) {
 			return false;
 		}
-		return (!manualRedactionEntry.isAddToDictionary() && !manualRedactionEntry.isAddToDossierDictionary()) || ((manualRedactionEntry.isAddToDictionary()
-				|| manualRedactionEntry.isAddToDossierDictionary())
-				&& manualRedactionEntry.getProcessedDate() == null);
+		if (manualRedactionEntry.getProcessedDate() == null) {
+			return false;
+		}
+		return (!manualRedactionEntry.isAddToDictionary() && !manualRedactionEntry.isAddToDossierDictionary());
 	}
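The rewritten `shouldCreateManualEntry` replaces the single `isApproved()` gate with three sequential rules: entries marked as false positives or false recommendations never qualify, entries without a `processedDate` never qualify, and of the rest only entries not bound for any dictionary qualify. Reduced to a standalone predicate over booleans (a sketch with simplified parameters; `processed` stands in for `processedDate != null`):

```java
// Standalone restatement of the new shouldCreateManualEntry decision logic.
final class ManualEntryPredicateSketch {

    enum DictionaryEntryType { REGULAR, FALSE_POSITIVE, FALSE_RECOMMENDATION }

    static boolean shouldCreateManualEntry(DictionaryEntryType dictType,
                                           boolean processed,
                                           boolean addToDictionary,
                                           boolean addToDossierDictionary) {
        // 1. False positives / false recommendations never become manual entries.
        if (dictType == DictionaryEntryType.FALSE_POSITIVE
                || dictType == DictionaryEntryType.FALSE_RECOMMENDATION) {
            return false;
        }
        // 2. Entries without a processedDate are skipped.
        if (!processed) {
            return false;
        }
        // 3. Only purely local entries, not bound for any dictionary, qualify.
        return !addToDictionary && !addToDossierDictionary;
    }
}
```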

View File

@@ -0,0 +1,101 @@
+package com.iqser.red.service.redaction.v1.server.migration;
+
+import java.util.Collections;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+
+import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ChangeType;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualChange;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.Change;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.Engine;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ManualRedactionType;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.RedactionLogEntry;
+
+public class MigrationMapper {
+
+	public static com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Change toEntityLogChanges(Change change) {
+		return new com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Change(change.getAnalysisNumber(),
+				toEntityLogType(change.getType()),
+				change.getDateTime());
+	}
+
+	public static com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualChange toEntityLogManualChanges(com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ManualChange manualChange) {
+		return new com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualChange(toManualRedactionType(manualChange.getManualRedactionType()),
+				manualChange.getProcessedDate(),
+				manualChange.getRequestedDate(),
+				manualChange.getUserId(),
+				manualChange.getPropertyChanges());
+	}
+
+	public static ChangeType toEntityLogType(com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ChangeType type) {
+		return switch (type) {
+			case ADDED -> ChangeType.ADDED;
+			case REMOVED -> ChangeType.REMOVED;
+			case CHANGED -> ChangeType.CHANGED;
+		};
+	}
+
+	public static com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType toManualRedactionType(ManualRedactionType manualRedactionType) {
+		return switch (manualRedactionType) {
+			case ADD_LOCALLY -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.ADD;
+			case ADD_TO_DICTIONARY -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.ADD_TO_DICTIONARY;
+			case REMOVE_LOCALLY -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.REMOVE;
+			case REMOVE_FROM_DICTIONARY -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.REMOVE_FROM_DICTIONARY;
+			case FORCE_REDACT -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.FORCE;
+			case FORCE_HINT -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.FORCE;
+			case RECATEGORIZE -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.RECATEGORIZE;
+			case LEGAL_BASIS_CHANGE -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.LEGAL_BASIS_CHANGE;
+			case RESIZE -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.RESIZE;
+		};
+	}
+
+	public static com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine toEntityLogEngine(Engine engine) {
+		return switch (engine) {
+			case DICTIONARY -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine.DICTIONARY;
+			case NER -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine.NER;
+			case RULE -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine.RULE;
+		};
+	}
+
+	public static Set<com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine> getMigratedEngines(RedactionLogEntry entry) {
+		Set<com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine> engines = new HashSet<>();
+		if (entry.isImported()) {
+			engines.add(com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine.IMPORTED);
+		}
+		if (entry.getEngines() == null) {
+			return engines;
+		}
+		entry.getEngines()
+				.stream()
+				.map(MigrationMapper::toEntityLogEngine)
+				.forEach(engines::add);
+		return engines;
+	}
+
+	public List<ManualChange> migrateManualChanges(List<com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ManualChange> manualChanges) {
+		if (manualChanges == null) {
+			return Collections.emptyList();
+		}
+		return manualChanges.stream()
+				.map(MigrationMapper::toEntityLogManualChanges)
+				.toList();
+	}
+}
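`MigrationMapper` relies on exhaustive switch expressions for its enum-to-enum conversions: if a new constant is added to a source enum, every mapper over it fails to compile until a mapping is supplied, and several source constants (e.g. `FORCE_REDACT` and `FORCE_HINT`) deliberately collapse onto one target value. The pattern in miniature (enum names here are illustrative, not the real types):

```java
// Miniature of the exhaustive enum-to-enum mapping pattern used in MigrationMapper.
final class EnumMappingSketch {

    enum SourceType { FORCE_REDACT, FORCE_HINT, RESIZE }

    enum TargetType { FORCE, RESIZE }

    static TargetType map(SourceType type) {
        // Exhaustive switch expression without a default branch: adding a new
        // SourceType constant will not compile until it is handled here.
        return switch (type) {
            case FORCE_REDACT, FORCE_HINT -> TargetType.FORCE; // two sources, one target
            case RESIZE -> TargetType.RESIZE;
        };
    }
}
```

Omitting the `default` branch is the design choice that buys the compile-time check; a `default` would silently swallow new constants.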

View File

@@ -57,20 +57,31 @@ public class MigrationMessageReceiver {
 		if (redactionLog.getAnalysisVersion() == 0) {
 			redactionLog = legacyVersion0MigrationService.mergeDuplicateAnnotationIds(redactionLog);
-		} else if (migrationRequest.getManualRedactions() != null) {
-			redactionLog = legacyRedactionLogMergeService.addManualAddEntriesAndRemoveSkippedImported(redactionLog, migrationRequest.getManualRedactions(), migrationRequest.getDossierTemplateId());
+		} else {
+			redactionLog = legacyRedactionLogMergeService.addManualAddEntriesAndRemoveSkippedImported(redactionLog,
+					migrationRequest.getManualRedactions(),
+					migrationRequest.getDossierTemplateId());
 		}
 
-		MigratedEntityLog migratedEntityLog = redactionLogToEntityLogMigrationService.migrate(redactionLog, document, migrationRequest.getDossierTemplateId(), migrationRequest.getManualRedactions());
+		MigratedEntityLog migratedEntityLog = redactionLogToEntityLogMigrationService.migrate(redactionLog,
+				document,
+				migrationRequest.getDossierTemplateId(),
+				migrationRequest.getManualRedactions(),
+				migrationRequest.getFileId(),
+				migrationRequest.getEntitiesWithComments(),
+				migrationRequest.isFileIsApproved());
 
+		log.info("Storing migrated entityLog and ids to migrate in DB for file {}", migrationRequest.getFileId());
 		redactionStorageService.storeObject(migrationRequest.getDossierId(), migrationRequest.getFileId(), FileType.ENTITY_LOG, migratedEntityLog.getEntityLog());
 		redactionStorageService.storeObject(migrationRequest.getDossierId(), migrationRequest.getFileId(), FileType.MIGRATED_IDS, migratedEntityLog.getMigratedIds());
 		sendFinished(MigrationResponse.builder().dossierId(migrationRequest.getDossierId()).fileId(migrationRequest.getFileId()).build());
-		log.info("Migrated {} redactionLog entries for dossierId {} and fileId {}",
-				migratedEntityLog.getEntityLog().getEntityLogEntry().size(),
-				migrationRequest.getDossierId(),
-				migrationRequest.getFileId());
+		log.info("Migrated {} redactionLog entries, found {} annotation ids for migration in the db, {} new manual entries, for dossierId {} and fileId {}",
+				migratedEntityLog.getEntityLog().getEntityLogEntry().size(),
+				migratedEntityLog.getMigratedIds().getMappings().size(),
+				migratedEntityLog.getMigratedIds().getManualRedactionEntriesToAdd().size(),
+				migrationRequest.getDossierId(),
+				migrationRequest.getFileId());
 		log.info("");
 	}

View File

@ -8,6 +8,7 @@ import java.util.LinkedList;
import java.util.List; import java.util.List;
import java.util.Map; import java.util.Map;
import java.util.Optional; import java.util.Optional;
import java.util.Set;
import java.util.function.Function; import java.util.function.Function;
import java.util.stream.Collectors; import java.util.stream.Collectors;
import java.util.stream.Stream; import java.util.stream.Stream;
@@ -19,29 +20,26 @@ import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.migration.MigratedIds;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.ManualRedactions;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.BaseAnnotation;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.IdRemoval;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualForceRedaction;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRedactionEntry;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualResizeRedaction;
-import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ManualRedactionType;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.Rectangle;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.RedactionLog;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.RedactionLogEntry;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.RedactionLogLegalBasis;
-import com.iqser.red.service.redaction.v1.model.MigrationRequest;
-import com.iqser.red.service.redaction.v1.server.model.PrecursorEntity;
 import com.iqser.red.service.redaction.v1.server.model.MigratedEntityLog;
 import com.iqser.red.service.redaction.v1.server.model.MigrationEntity;
+import com.iqser.red.service.redaction.v1.server.model.PrecursorEntity;
 import com.iqser.red.service.redaction.v1.server.model.RectangleWithPage;
-import com.iqser.red.service.redaction.v1.server.model.document.TextRange;
-import com.iqser.red.service.redaction.v1.server.model.document.entity.EntityType;
 import com.iqser.red.service.redaction.v1.server.model.document.entity.TextEntity;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.Document;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.Image;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.ImageType;
-import com.iqser.red.service.redaction.v1.server.model.document.nodes.SemanticNode;
 import com.iqser.red.service.redaction.v1.server.service.DictionaryService;
 import com.iqser.red.service.redaction.v1.server.service.ManualChangesApplicationService;
-import com.iqser.red.service.redaction.v1.server.service.document.EntityCreationService;
-import com.iqser.red.service.redaction.v1.server.service.document.EntityEnrichmentService;
 import com.iqser.red.service.redaction.v1.server.service.document.EntityFindingUtility;
+import com.iqser.red.service.redaction.v1.server.service.document.EntityFromPrecursorCreationService;
 import com.iqser.red.service.redaction.v1.server.utils.IdBuilder;
 import com.iqser.red.service.redaction.v1.server.utils.MigratedIdsCollector;
@@ -59,18 +57,26 @@ public class RedactionLogToEntityLogMigrationService {
     private static final double MATCH_THRESHOLD = 10;
     EntityFindingUtility entityFindingUtility;
-    EntityEnrichmentService entityEnrichmentService;
     DictionaryService dictionaryService;
     ManualChangesApplicationService manualChangesApplicationService;
-    public MigratedEntityLog migrate(RedactionLog redactionLog, Document document, String dossierTemplateId, ManualRedactions manualRedactions) {
+    public MigratedEntityLog migrate(RedactionLog redactionLog,
+                                     Document document,
+                                     String dossierTemplateId,
+                                     ManualRedactions manualRedactions,
+                                     String fileId,
+                                     Set<String> entitiesWithComments,
+                                     boolean fileIsApproved) {
-        List<MigrationEntity> entitiesToMigrate = calculateMigrationEntitiesFromRedactionLog(redactionLog, document, dossierTemplateId);
+        log.info("Migrating entities for file {}", fileId);
+        List<MigrationEntity> entitiesToMigrate = calculateMigrationEntitiesFromRedactionLog(redactionLog, document, dossierTemplateId, fileId);
         MigratedIds migratedIds = entitiesToMigrate.stream()
                 .collect(new MigratedIdsCollector());
-        applyManualChanges(entitiesToMigrate, manualRedactions);
+        log.info("applying manual changes to migrated entities for file {}", fileId);
+        applyLocalProcessedManualChanges(entitiesToMigrate, manualRedactions, fileIsApproved);
         EntityLog entityLog = new EntityLog();
         entityLog.setAnalysisNumber(redactionLog.getAnalysisNumber());
@@ -85,11 +91,13 @@ public class RedactionLogToEntityLogMigrationService {
                 .toList());
         Map<String, String> oldToNewIDMapping = migratedIds.buildOldToNewMapping();
+        log.info("Writing migrated entities to entityLog for file {}", fileId);
         entityLog.setEntityLogEntry(entitiesToMigrate.stream()
                 .map(migrationEntity -> migrationEntity.toEntityLogEntry(oldToNewIDMapping))
                 .toList());
-        if (getNumberOfApprovedEntries(redactionLog) != entityLog.getEntityLogEntry().size()) {
+        if (getNumberOfApprovedEntries(redactionLog, document.getNumberOfPages()) != entityLog.getEntityLogEntry().size()) {
             String message = String.format("Not all entities have been found during the migration redactionLog has %d entries and new entityLog %d",
                     redactionLog.getRedactionLogEntry().size(),
                     entityLog.getEntityLogEntry().size());
@@ -97,60 +105,99 @@ public class RedactionLogToEntityLogMigrationService {
             throw new AssertionError(message);
         }
+        Set<String> entitiesWithUnprocessedChanges = manualRedactions.buildAll()
+                .stream()
+                .filter(manualRedaction -> manualRedaction.getProcessedDate() == null)
+                .map(BaseAnnotation::getAnnotationId)
+                .collect(Collectors.toSet());
         MigratedIds idsToMigrateInDb = entitiesToMigrate.stream()
-                .filter(MigrationEntity::hasManualChangesOrComments)
+                .filter(migrationEntity -> migrationEntity.hasManualChangesOrComments(entitiesWithComments, entitiesWithUnprocessedChanges))
                 .filter(m -> !m.getOldId().equals(m.getNewId()))
                 .collect(new MigratedIdsCollector());
+        List<ManualRedactionEntry> manualRedactionEntriesToAdd = entitiesToMigrate.stream()
+                .filter(MigrationEntity::needsManualEntry)
+                .map(MigrationEntity::buildManualRedactionEntry)
+                .toList();
+        idsToMigrateInDb.setManualRedactionEntriesToAdd(manualRedactionEntriesToAdd);
+        List<String> manualForceRedactionIdsToDelete = entitiesToMigrate.stream()
+                .filter(MigrationEntity::needsForceDeletion)
+                .map(MigrationEntity::getNewId)
+                .toList();
+        idsToMigrateInDb.setForceRedactionIdsToDelete(manualForceRedactionIdsToDelete);
         return new MigratedEntityLog(idsToMigrateInDb, entityLog);
     }
-    private void applyManualChanges(List<MigrationEntity> entitiesToMigrate, ManualRedactions manualRedactions) {
+    private void applyLocalProcessedManualChanges(List<MigrationEntity> entitiesToMigrate, ManualRedactions manualRedactions, boolean fileIsApproved) {
         if (manualRedactions == null) {
             return;
         }
-        Map<String, List<BaseAnnotation>> manualChangesPerAnnotationId = Stream.of(manualRedactions.getIdsToRemove(),
-                manualRedactions.getEntriesToAdd(),
-                manualRedactions.getForceRedactions(),
-                manualRedactions.getResizeRedactions(),
-                manualRedactions.getLegalBasisChanges(),
-                manualRedactions.getRecategorizations(),
-                manualRedactions.getLegalBasisChanges())
-                .flatMap(Collection::stream)
-                .collect(Collectors.groupingBy(BaseAnnotation::getAnnotationId));
-        entitiesToMigrate.forEach(migrationEntity -> migrationEntity.applyManualChanges(manualChangesPerAnnotationId.getOrDefault(migrationEntity.getOldId(),
-                        Collections.emptyList()),
-                manualChangesApplicationService));
+        Map<String, List<BaseAnnotation>> manualChangesPerAnnotationId;
+        if (fileIsApproved) {
+            manualChangesPerAnnotationId = manualRedactions.buildAll()
+                    .stream()
+                    .filter(manualChange -> (manualChange.getProcessedDate() != null && manualChange.isLocal()) //
+                            // unprocessed dict change of type IdRemoval or ManualResize must be applied for approved documents
+                            || (manualChange.getProcessedDate() == null && !manualChange.isLocal() //
+                                    && (manualChange instanceof IdRemoval || manualChange instanceof ManualResizeRedaction)))
+                    .map(this::convertPendingDictChangesToLocal)
+                    .collect(Collectors.groupingBy(BaseAnnotation::getAnnotationId));
+        } else {
+            manualChangesPerAnnotationId = manualRedactions.buildAll()
+                    .stream()
+                    .filter(manualChange -> manualChange.getProcessedDate() != null)
+                    .filter(BaseAnnotation::isLocal)
+                    .collect(Collectors.groupingBy(BaseAnnotation::getAnnotationId));
+        }
+        entitiesToMigrate.forEach(migrationEntity -> manualChangesPerAnnotationId.getOrDefault(migrationEntity.getOldId(), Collections.emptyList())
+                .forEach(manualChange -> {
+                    if (manualChange instanceof ManualResizeRedaction manualResizeRedaction && migrationEntity.getMigratedEntity() instanceof TextEntity textEntity) {
+                        ManualResizeRedaction migratedManualResizeRedaction = ManualResizeRedaction.builder()
+                                .positions(manualResizeRedaction.getPositions())
+                                .annotationId(migrationEntity.getNewId())
+                                .updateDictionary(manualResizeRedaction.getUpdateDictionary())
+                                .addToAllDossiers(manualResizeRedaction.isAddToAllDossiers())
+                                .textAfter(manualResizeRedaction.getTextAfter())
+                                .textBefore(manualResizeRedaction.getTextBefore())
+                                .build();
+                        manualChangesApplicationService.resize(textEntity, migratedManualResizeRedaction);
+                    } else {
+                        migrationEntity.getMigratedEntity().getManualOverwrite().addChange(manualChange);
+                    }
+                }));
     }
-    private static long getNumberOfApprovedEntries(RedactionLog redactionLog) {
-        return redactionLog.getRedactionLogEntry().size();
-    }
+    private BaseAnnotation convertPendingDictChangesToLocal(BaseAnnotation baseAnnotation) {
+        if (baseAnnotation.getProcessedDate() != null) {
+            return baseAnnotation;
+        }
+        if (baseAnnotation.isLocal()) {
+            return baseAnnotation;
+        }
+        if (baseAnnotation instanceof ManualResizeRedaction manualResizeRedaction) {
+            manualResizeRedaction.setAddToAllDossiers(false);
+            manualResizeRedaction.setUpdateDictionary(false);
+        } else if (baseAnnotation instanceof IdRemoval idRemoval) {
+            idRemoval.setRemoveFromAllDossiers(false);
+            idRemoval.setRemoveFromDictionary(false);
+        }
+        return baseAnnotation;
+    }
+    private long getNumberOfApprovedEntries(RedactionLog redactionLog, int numberOfPages) {
+        return redactionLog.getRedactionLogEntry()
+                .stream()
+                .filter(redactionLogEntry -> isOnExistingPage(redactionLogEntry, numberOfPages))
+                .count();
+    }
-    private List<MigrationEntity> calculateMigrationEntitiesFromRedactionLog(RedactionLog redactionLog, Document document, String dossierTemplateId) {
-        List<MigrationEntity> images = getImageBasedMigrationEntities(redactionLog, document, dossierTemplateId);
-        List<MigrationEntity> textMigrationEntities = getTextBasedMigrationEntities(redactionLog, document, dossierTemplateId);
+    private List<MigrationEntity> calculateMigrationEntitiesFromRedactionLog(RedactionLog redactionLog, Document document, String dossierTemplateId, String fileId) {
+        List<MigrationEntity> images = getImageBasedMigrationEntities(redactionLog, document, fileId, dossierTemplateId);
+        List<MigrationEntity> textMigrationEntities = getTextBasedMigrationEntities(redactionLog, document, dossierTemplateId, fileId);
         return Stream.of(textMigrationEntities.stream(), images.stream())
                 .flatMap(Function.identity())
                 .toList();
@@ -163,7 +210,7 @@ public class RedactionLogToEntityLogMigrationService {
     }
-    private List<MigrationEntity> getImageBasedMigrationEntities(RedactionLog redactionLog, Document document, String dossierTemplateId) {
+    private List<MigrationEntity> getImageBasedMigrationEntities(RedactionLog redactionLog, Document document, String fileId, String dossierTemplateId) {
         List<Image> images = document.streamAllImages()
                 .collect(Collectors.toList());
@@ -195,7 +242,8 @@ public class RedactionLogToEntityLogMigrationService {
         }
         String ruleIdentifier;
-        String reason = Optional.ofNullable(redactionLogImage.getReason()).orElse("");
+        String reason = Optional.ofNullable(redactionLogImage.getReason())
+                .orElse("");
         if (redactionLogImage.getMatchedRule().isBlank() || redactionLogImage.getMatchedRule() == null) {
             ruleIdentifier = "OLDIMG.0.0";
         } else {
@@ -209,7 +257,7 @@ public class RedactionLogToEntityLogMigrationService {
         } else {
             closestImage.skip(ruleIdentifier, reason);
         }
-        migrationEntities.add(new MigrationEntity(null, redactionLogImage, closestImage, redactionLogImage.getId(), closestImage.getId()));
+        migrationEntities.add(MigrationEntity.fromRedactionLogImage(redactionLogImage, closestImage, fileId, dictionaryService, dossierTemplateId));
     }
     return migrationEntities;
 }
@@ -250,40 +298,21 @@ public class RedactionLogToEntityLogMigrationService {
     }
-    private List<MigrationEntity> getTextBasedMigrationEntities(RedactionLog redactionLog, Document document, String dossierTemplateId) {
+    private List<MigrationEntity> getTextBasedMigrationEntities(RedactionLog redactionLog, Document document, String dossierTemplateId, String fileId) {
         List<MigrationEntity> entitiesToMigrate = redactionLog.getRedactionLogEntry()
                 .stream()
                 .filter(redactionLogEntry -> !redactionLogEntry.isImage())
-                .map(entry -> MigrationEntity.fromRedactionLogEntry(entry, dictionaryService.isHint(entry.getType(), dossierTemplateId)))
-                .peek(migrationEntity -> {
-                    if (migrationEntity.getPrecursorEntity().getEntityType().equals(EntityType.HINT) &&//
-                            !migrationEntity.getRedactionLogEntry().isHint() &&//
-                            !migrationEntity.getRedactionLogEntry().isRedacted()) {
-                        migrationEntity.getPrecursorEntity().ignore(migrationEntity.getPrecursorEntity().getRuleIdentifier(), migrationEntity.getPrecursorEntity().getReason());
-                    } else if (migrationEntity.getRedactionLogEntry().lastChangeIsRemoved()) {
-                        migrationEntity.getPrecursorEntity().remove(migrationEntity.getPrecursorEntity().getRuleIdentifier(), migrationEntity.getPrecursorEntity().getReason());
-                    } else if (lastManualChangeIsRemove(migrationEntity)) {
-                        migrationEntity.getPrecursorEntity().ignore(migrationEntity.getPrecursorEntity().getRuleIdentifier(), migrationEntity.getPrecursorEntity().getReason());
-                    } else if (migrationEntity.getPrecursorEntity().isApplied() && migrationEntity.getRedactionLogEntry().isRecommendation()) {
-                        migrationEntity.getPrecursorEntity()
-                                .skip(migrationEntity.getPrecursorEntity().getRuleIdentifier(), migrationEntity.getPrecursorEntity().getReason());
-                    } else if (migrationEntity.getPrecursorEntity().isApplied()) {
-                        migrationEntity.getPrecursorEntity()
-                                .apply(migrationEntity.getPrecursorEntity().getRuleIdentifier(),
-                                        migrationEntity.getPrecursorEntity().getReason(),
-                                        migrationEntity.getPrecursorEntity().getLegalBasis());
-                    } else {
-                        migrationEntity.getPrecursorEntity()
-                                .skip(migrationEntity.getPrecursorEntity().getRuleIdentifier(), migrationEntity.getPrecursorEntity().getReason());
-                    }
-                })
+                .filter(redactionLogEntry -> isOnExistingPage(redactionLogEntry, document.getNumberOfPages()))
+                .map(entry -> MigrationEntity.fromRedactionLogEntry(entry, fileId, dictionaryService, dossierTemplateId))
                 .toList();
-        Map<String, List<TextEntity>> tempEntitiesByValue = entityFindingUtility.findAllPossibleEntitiesAndGroupByValue(document,
-                entitiesToMigrate.stream()
-                        .map(MigrationEntity::getPrecursorEntity)
-                        .toList());
+        List<PrecursorEntity> precursorEntities = entitiesToMigrate.stream()
+                .map(MigrationEntity::getPrecursorEntity)
+                .toList();
+        log.info("Finding all possible entities");
+        Map<String, List<TextEntity>> tempEntitiesByValue = entityFindingUtility.findAllPossibleEntitiesAndGroupByValue(document, precursorEntities);
         for (MigrationEntity migrationEntity : entitiesToMigrate) {
             Optional<TextEntity> optionalTextEntity = entityFindingUtility.findClosestEntityAndReturnEmptyIfNotFound(migrationEntity.getPrecursorEntity(),
@@ -297,45 +326,35 @@ public class RedactionLogToEntityLogMigrationService {
                 continue;
             }
-            TextEntity entity = createCorrectEntity(migrationEntity.getPrecursorEntity(), document, optionalTextEntity.get().getTextRange());
-            migrationEntity.setMigratedEntity(entity);
-            migrationEntity.setOldId(migrationEntity.getPrecursorEntity().getId());
-            migrationEntity.setNewId(entity.getId());
+            // Can only be on one page, since redactionLogEntries can only be on one page
+            TextEntity migratedEntity = EntityFromPrecursorCreationService.createCorrectEntity(migrationEntity.getPrecursorEntity(), optionalTextEntity.get(), true);
+            migrationEntity.setMigratedEntity(migratedEntity);
+            migrationEntity.setOldId(migrationEntity.getPrecursorEntity().getId());
+            migrationEntity.setNewId(migratedEntity.getId());
         }
         tempEntitiesByValue.values()
                 .stream()
                 .flatMap(Collection::stream)
                 .forEach(TextEntity::removeFromGraph);
         return entitiesToMigrate;
     }
-    private static boolean lastManualChangeIsRemove(MigrationEntity migrationEntity) {
-        if (migrationEntity.getRedactionLogEntry().getManualChanges() == null) {
-            return false;
-        }
-        return migrationEntity.getRedactionLogEntry().getManualChanges()
-                .stream()
-                .reduce((a, b) -> b)
-                .map(m -> m.getManualRedactionType().equals(ManualRedactionType.REMOVE_LOCALLY))
-                .orElse(false);
-    }
-    private TextEntity createCorrectEntity(PrecursorEntity precursorEntity, SemanticNode node, TextRange closestTextRange) {
-        EntityCreationService entityCreationService = new EntityCreationService(entityEnrichmentService);
-        TextEntity correctEntity = entityCreationService.forceByTextRange(closestTextRange, precursorEntity.getType(), precursorEntity.getEntityType(), node);
-        correctEntity.addMatchedRules(precursorEntity.getMatchedRuleList());
-        correctEntity.setDictionaryEntry(precursorEntity.isDictionaryEntry());
-        correctEntity.setDossierDictionaryEntry(precursorEntity.isDossierDictionaryEntry());
-        correctEntity.getManualOverwrite().addChanges(precursorEntity.getManualOverwrite().getManualChangeLog());
-        return correctEntity;
-    }
+    private boolean isOnExistingPage(RedactionLogEntry redactionLogEntry, int numberOfPages) {
+        var pages = redactionLogEntry.getPositions()
+                .stream()
+                .map(Rectangle::getPage)
+                .collect(Collectors.toSet());
+        for (int page : pages) {
+            if (page > numberOfPages) {
+                return false;
+            }
+        }
+        return true;
+    }
 }
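The page-existence guard introduced above (`isOnExistingPage`, also used by the new `getNumberOfApprovedEntries`) can be sketched in isolation. This is a simplified illustration, not the service's real code: `PageGuardSketch`, `Rect`, and `Entry` are hypothetical stand-ins for the service's `Rectangle` and `RedactionLogEntry` types.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class PageGuardSketch {

    // Stand-in for redactionlog.Rectangle: only the page index matters here.
    record Rect(int page) {}

    // Stand-in for RedactionLogEntry with its position rectangles.
    record Entry(List<Rect> positions) {}

    // Mirrors the new guard: an entry survives migration only if every
    // page it touches still exists in the migrated document.
    static boolean isOnExistingPage(Entry entry, int numberOfPages) {
        Set<Integer> pages = entry.positions().stream()
                .map(Rect::page)
                .collect(Collectors.toSet());
        for (int page : pages) {
            if (page > numberOfPages) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Entry onPage2 = new Entry(List.of(new Rect(2)));
        Entry onPage7 = new Entry(List.of(new Rect(2), new Rect(7)));
        System.out.println(isOnExistingPage(onPage2, 5)); // true
        System.out.println(isOnExistingPage(onPage7, 5)); // false
    }
}
```

Filtering entries up front this way also keeps the count check consistent: entries dropped by the guard are excluded on both sides of the `getNumberOfApprovedEntries` comparison, so stale entries on removed pages no longer trip the `AssertionError`.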

View File

@@ -1,7 +1,10 @@
 package com.iqser.red.service.redaction.v1.server.model;
+import java.util.List;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntityLog;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.migration.MigratedIds;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRedactionEntry;
 import lombok.AllArgsConstructor;
 import lombok.Builder;
@@ -16,5 +19,4 @@ public class MigratedEntityLog {
     MigratedIds migratedIds;
     EntityLog entityLog;
 }
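The approved-vs-unapproved branch in `applyLocalProcessedManualChanges` can be summarized as a single predicate. The sketch below is illustrative only: `ChangeFilterSketch` and its `Change` class are hypothetical stand-ins for `BaseAnnotation` and its `IdRemoval`/`ManualResizeRedaction` subtypes, with the instanceof checks collapsed into a boolean flag.

```java
public class ChangeFilterSketch {

    // Minimal stand-in for BaseAnnotation.
    static class Change {
        final boolean processed;       // processedDate != null
        final boolean local;           // isLocal()
        final boolean resizeOrRemoval; // instanceof IdRemoval || ManualResizeRedaction

        Change(boolean processed, boolean local, boolean resizeOrRemoval) {
            this.processed = processed;
            this.local = local;
            this.resizeOrRemoval = resizeOrRemoval;
        }
    }

    // Mirrors the two branches: approved files additionally replay pending
    // dictionary-driven removals/resizes (later downgraded to local-only by
    // convertPendingDictChangesToLocal); unapproved files replay only
    // processed local changes.
    static boolean shouldApply(Change c, boolean fileIsApproved) {
        if (fileIsApproved) {
            return (c.processed && c.local)
                    || (!c.processed && !c.local && c.resizeOrRemoval);
        }
        return c.processed && c.local;
    }

    public static void main(String[] args) {
        Change processedLocal = new Change(true, true, false);
        Change pendingDictResize = new Change(false, false, true);
        System.out.println(shouldApply(processedLocal, false));    // true
        System.out.println(shouldApply(pendingDictResize, false)); // false
        System.out.println(shouldApply(pendingDictResize, true));  // true
    }
}
```

The asymmetry is deliberate: for an unapproved file, a still-pending dictionary change will be applied by the next regular analysis anyway, whereas an approved file gets no further analysis, so pending removals and resizes must be baked in during migration.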

View File

@@ -1,65 +1,151 @@
 package com.iqser.red.service.redaction.v1.server.model;
+import static com.iqser.red.service.redaction.v1.server.service.EntityLogCreatorService.buildEntryState;
+import static com.iqser.red.service.redaction.v1.server.service.EntityLogCreatorService.buildEntryType;
+import java.awt.geom.Rectangle2D;
+import java.time.OffsetDateTime;
 import java.util.Collections;
+import java.util.LinkedList;
 import java.util.List;
+import java.util.Locale;
 import java.util.Map;
 import java.util.Optional;
 import java.util.Set;
 import java.util.stream.Collectors;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Change;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ChangeType;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntityLogEntry;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntryState;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntryType;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Position;
-import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.Change;
-import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.Engine;
-import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ManualChange;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.ManualChangeFactory;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.Rectangle;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.BaseAnnotation;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualForceRedaction;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRecategorization;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRedactionEntry;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualResizeRedaction;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.dossiertemplate.type.DictionaryEntryType;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ManualRedactionType;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.RedactionLogEntry;
+import com.iqser.red.service.redaction.v1.server.migration.MigrationMapper;
 import com.iqser.red.service.redaction.v1.server.model.document.entity.EntityType;
 import com.iqser.red.service.redaction.v1.server.model.document.entity.IEntity;
 import com.iqser.red.service.redaction.v1.server.model.document.entity.ManualChangeOverwrite;
 import com.iqser.red.service.redaction.v1.server.model.document.entity.TextEntity;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.Image;
-import com.iqser.red.service.redaction.v1.server.service.ManualChangeFactory;
+import com.iqser.red.service.redaction.v1.server.model.document.nodes.ImageType;
+import com.iqser.red.service.redaction.v1.server.service.DictionaryService;
+import com.iqser.red.service.redaction.v1.server.service.ManualChangesApplicationService;
 import lombok.AllArgsConstructor;
+import lombok.Builder;
 import lombok.Data;
 import lombok.RequiredArgsConstructor;
+import lombok.extern.slf4j.Slf4j;
+@Slf4j
 @Data
+@Builder
 @AllArgsConstructor
 @RequiredArgsConstructor
 public final class MigrationEntity {
     private final PrecursorEntity precursorEntity;
     private final RedactionLogEntry redactionLogEntry;
+    private final DictionaryService dictionaryService;
+    private final String dossierTemplateId;
     private IEntity migratedEntity;
     private String oldId;
     private String newId;
+    private String fileId;
+    @Builder.Default
+    List<BaseAnnotation> manualChanges = new LinkedList<>();
-    public static MigrationEntity fromRedactionLogEntry(RedactionLogEntry redactionLogEntry, boolean hint) {
-        return new MigrationEntity(createPrecursorEntity(redactionLogEntry, hint), redactionLogEntry);
+    public static MigrationEntity fromRedactionLogEntry(RedactionLogEntry redactionLogEntry, String fileId, DictionaryService dictionaryService, String dossierTemplateId) {
+        boolean hint = dictionaryService.isHint(redactionLogEntry.getType(), dossierTemplateId);
+        PrecursorEntity precursorEntity = createPrecursorEntity(redactionLogEntry, hint);
+        if (precursorEntity.getEntityType().equals(EntityType.HINT) && !redactionLogEntry.isHint() && !redactionLogEntry.isRedacted()) {
+            precursorEntity.ignore(precursorEntity.getRuleIdentifier(), precursorEntity.getReason());
+        } else if (redactionLogEntry.lastChangeIsRemoved()) {
+            precursorEntity.remove(precursorEntity.getRuleIdentifier(), precursorEntity.getReason());
+        } else if (lastManualChangeIsRemove(redactionLogEntry)) {
+            precursorEntity.ignore(precursorEntity.getRuleIdentifier(), precursorEntity.getReason());
+        } else if (precursorEntity.isApplied() && redactionLogEntry.isRecommendation()) {
+            precursorEntity.skip(precursorEntity.getRuleIdentifier(), precursorEntity.getReason());
+        } else if (precursorEntity.isApplied()) {
+            precursorEntity.apply(precursorEntity.getRuleIdentifier(), precursorEntity.getReason(), precursorEntity.getLegalBasis());
+        } else {
+            precursorEntity.skip(precursorEntity.getRuleIdentifier(), precursorEntity.getReason());
+        }
+        return MigrationEntity.builder()
+                .precursorEntity(precursorEntity)
+                .redactionLogEntry(redactionLogEntry)
+                .oldId(redactionLogEntry.getId())
+                .fileId(fileId)
+                .dictionaryService(dictionaryService)
+                .dossierTemplateId(dossierTemplateId)
+                .build();
+    }
+    public static MigrationEntity fromRedactionLogImage(RedactionLogEntry redactionLogImage,
+                                                        Image image,
+                                                        String fileId,
+                                                        DictionaryService dictionaryService,
+                                                        String dossierTemplateId) {
+        return MigrationEntity.builder()
+                .redactionLogEntry(redactionLogImage)
+                .migratedEntity(image)
+                .oldId(redactionLogImage.getId())
+                .newId(image.getId())
+                .fileId(fileId)
+                .dictionaryService(dictionaryService)
+                .dossierTemplateId(dossierTemplateId)
+                .build();
+    }
+    private static boolean lastManualChangeIsRemove(RedactionLogEntry redactionLogEntry) {
+        if (redactionLogEntry.getManualChanges() == null) {
+            return false;
+        }
+        return redactionLogEntry.getManualChanges()
+                .stream()
+                .reduce((a, b) -> b)
+                .map(m -> m.getManualRedactionType().equals(ManualRedactionType.REMOVE_LOCALLY))
+                .orElse(false);
+    }
     public static PrecursorEntity createPrecursorEntity(RedactionLogEntry redactionLogEntry, boolean hint) {
         String ruleIdentifier = buildRuleIdentifier(redactionLogEntry);
-        List<RectangleWithPage> rectangleWithPages = redactionLogEntry.getPositions().stream().map(RectangleWithPage::fromRedactionLogRectangle).toList();
+        List<RectangleWithPage> rectangleWithPages = redactionLogEntry.getPositions()
+                .stream()
+                .map(RectangleWithPage::fromRedactionLogRectangle)
+                .toList();
         EntityType entityType = getEntityType(redactionLogEntry, hint);
         return PrecursorEntity.builder()
                 .id(redactionLogEntry.getId())
                 .value(redactionLogEntry.getValue())
                 .entityPosition(rectangleWithPages)
                 .ruleIdentifier(ruleIdentifier)
-                .reason(Optional.ofNullable(redactionLogEntry.getReason()).orElse(""))
+                .reason(Optional.ofNullable(redactionLogEntry.getReason())
+                        .orElse(""))
                 .legalBasis(redactionLogEntry.getLegalBasis())
                 .type(redactionLogEntry.getType())
                 .section(redactionLogEntry.getSection())
+                .engines(MigrationMapper.getMigratedEngines(redactionLogEntry))
                 .entityType(entityType)
                 .applied(redactionLogEntry.isRedacted())
                 .isDictionaryEntry(redactionLogEntry.isDictionaryEntry())
@@ -100,62 +186,6 @@ public final class MigrationEntity {
     }
 
-    private static com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Change toEntityLogChanges(Change change) {
-        return new com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Change(change.getAnalysisNumber(),
-                toEntityLogType(change.getType()),
-                change.getDateTime());
-    }
-
-    private static EntryType getEntryType(EntityType entityType) {
-        return switch (entityType) {
-            case ENTITY -> EntryType.ENTITY;
-            case HINT -> EntryType.HINT;
-            case FALSE_POSITIVE -> EntryType.FALSE_POSITIVE;
-            case RECOMMENDATION -> EntryType.RECOMMENDATION;
-            case FALSE_RECOMMENDATION -> EntryType.FALSE_RECOMMENDATION;
-        };
-    }
-
-    private static com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualChange toEntityLogManualChanges(ManualChange manualChange) {
-        return new com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualChange(toManualRedactionType(manualChange.getManualRedactionType()),
-                manualChange.getProcessedDate(),
-                manualChange.getRequestedDate(),
-                manualChange.getUserId(),
-                manualChange.getPropertyChanges());
-    }
-
-    private static ChangeType toEntityLogType(com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ChangeType type) {
-        return switch (type) {
-            case ADDED -> ChangeType.ADDED;
-            case REMOVED -> ChangeType.REMOVED;
-            case CHANGED -> ChangeType.CHANGED;
-        };
-    }
-
-    private static com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType toManualRedactionType(ManualRedactionType manualRedactionType) {
-        return switch (manualRedactionType) {
-            case ADD_LOCALLY -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.ADD_LOCALLY;
-            case ADD_TO_DICTIONARY -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.ADD_TO_DICTIONARY;
-            case REMOVE_LOCALLY -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.REMOVE_LOCALLY;
-            case REMOVE_FROM_DICTIONARY -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.REMOVE_FROM_DICTIONARY;
-            case FORCE_REDACT -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.FORCE_REDACT;
-            case FORCE_HINT -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.FORCE_HINT;
-            case RECATEGORIZE -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.RECATEGORIZE;
-            case LEGAL_BASIS_CHANGE -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.LEGAL_BASIS_CHANGE;
-            case RESIZE -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.RESIZE;
-        };
-    }
-
     public EntityLogEntry toEntityLogEntry(Map<String, String> oldToNewIdMapping) {
         EntityLogEntry entityLogEntry;
@@ -169,20 +199,37 @@ public final class MigrationEntity {
             throw new UnsupportedOperationException("Unknown subclass " + migratedEntity.getClass());
         }
-        entityLogEntry.setManualChanges(ManualChangeFactory.toManualChangeList(migratedEntity.getManualOverwrite().getManualChangeLog(), redactionLogEntry.isHint()));
+        entityLogEntry.setManualChanges(ManualChangeFactory.toLocalManualChangeList(migratedEntity.getManualOverwrite().getManualChangeLog(), true));
         entityLogEntry.setColor(redactionLogEntry.getColor());
-        entityLogEntry.setChanges(redactionLogEntry.getChanges().stream().map(MigrationEntity::toEntityLogChanges).toList());
+        entityLogEntry.setChanges(redactionLogEntry.getChanges()
+                .stream()
+                .map(MigrationMapper::toEntityLogChanges)
+                .toList());
         entityLogEntry.setReference(migrateSetOfIds(redactionLogEntry.getReference(), oldToNewIdMapping));
         entityLogEntry.setImportedRedactionIntersections(migrateSetOfIds(redactionLogEntry.getImportedRedactionIntersections(), oldToNewIdMapping));
-        entityLogEntry.setEngines(getMigratedEngines(redactionLogEntry));
+        entityLogEntry.setEngines(MigrationMapper.getMigratedEngines(redactionLogEntry));
+        if (redactionLogEntry.getLegalBasis() != null) {
+            entityLogEntry.setLegalBasis(redactionLogEntry.getLegalBasis());
+        }
         if (entityLogEntry.getEntryType().equals(EntryType.HINT) && lastManualChangeIsRemoveLocally(entityLogEntry)) {
             entityLogEntry.setState(EntryState.IGNORED);
         }
+        if (redactionLogEntry.isImported() && redactionLogEntry.getValue() == null) {
+            entityLogEntry.setValue("Imported Redaction");
+        }
+        if (entityLogEntry.getChanges() != null && !entityLogEntry.getChanges().isEmpty() && entityLogEntry.getChanges()
+                .stream()
+                .map(Change::getType)
+                .toList()
+                .get(entityLogEntry.getChanges().size() - 1).equals(ChangeType.REMOVED)) {
+            entityLogEntry.setState(EntryState.REMOVED);
+            if (!entityLogEntry.getManualChanges().isEmpty()) {
+                entityLogEntry.getManualChanges()
+                        .removeIf(manualChange -> manualChange.getManualRedactionType()
+                                .equals(com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.FORCE));
+            }
+        }
         return entityLogEntry;
     }
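The hunk above (RED-9274) sets the migrated entry's state to REMOVED whenever the last entry in the change log has type REMOVED. This is not project code, just a minimal standalone sketch of that rule; the enum and method names here are illustrative only.

```java
import java.util.List;

// Standalone sketch of the migration rule: if the most recent change in the
// log is a removal, the migrated entry's state becomes REMOVED.
class StateRuleSketch {

    enum ChangeType { ADDED, CHANGED, REMOVED }

    enum EntryState { APPLIED, REMOVED }

    // Returns REMOVED when the last change is REMOVED, otherwise APPLIED.
    static EntryState migrateState(List<ChangeType> changes) {
        if (changes != null && !changes.isEmpty()
                && changes.get(changes.size() - 1) == ChangeType.REMOVED) {
            return EntryState.REMOVED;
        }
        return EntryState.APPLIED;
    }
}
```

Note that only the position of the REMOVED change matters: a removal that was later followed by another change does not force the REMOVED state.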
@@ -192,69 +239,45 @@ public final class MigrationEntity {
         return entityLogEntry.getManualChanges()
                 .stream()
                 .reduce((a, b) -> b)
-                .filter(mc -> mc.getManualRedactionType()
-                        .equals(com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.REMOVE_LOCALLY))
+                .filter(mc -> mc.getManualRedactionType().equals(com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.REMOVE))
                 .isPresent();
     }
 
-    private List<com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualChange> migrateManualChanges(List<ManualChange> manualChanges) {
-        if (manualChanges == null) {
-            return Collections.emptyList();
-        }
-        return manualChanges.stream().map(MigrationEntity::toEntityLogManualChanges).toList();
-    }
-
-    private static Set<com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine> getMigratedEngines(RedactionLogEntry entry) {
-        if (entry.getEngines() == null) {
-            return Collections.emptySet();
-        }
-        return entry.getEngines().stream().map(MigrationEntity::toEntityLogEngine).collect(Collectors.toSet());
-    }
-
     private Set<String> migrateSetOfIds(Set<String> ids, Map<String, String> oldToNewIdMapping) {
         if (ids == null) {
             return Collections.emptySet();
         }
-        return ids.stream().map(oldToNewIdMapping::get).collect(Collectors.toSet());
-    }
-
-    private static com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine toEntityLogEngine(Engine engine) {
-        return switch (engine) {
-            case DICTIONARY -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine.DICTIONARY;
-            case NER -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine.NER;
-            case RULE -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine.RULE;
-        };
+        return ids.stream()
+                .map(oldToNewIdMapping::get)
+                .collect(Collectors.toSet());
     }
 
     public EntityLogEntry createEntityLogEntry(Image image) {
+        String imageType = image.getImageType().equals(ImageType.OTHER) ? "image" : image.getImageType().toString().toLowerCase(Locale.ENGLISH);
         List<Position> positions = getPositionsFromOverride(image).orElse(List.of(new Position(image.getPosition(), image.getPage().getNumber())));
         return EntityLogEntry.builder()
                 .id(image.getId())
-                .value(image.value())
-                .type(image.type())
+                .value(image.getValue())
+                .type(imageType)
                 .reason(image.buildReasonWithManualChangeDescriptions())
-                .legalBasis(image.legalBasis())
+                .legalBasis(image.getManualOverwrite().getLegalBasis()
+                        .orElse(redactionLogEntry.getLegalBasis()))
                 .matchedRule(image.getMatchedRule().getRuleIdentifier().toString())
                 .dictionaryEntry(false)
                 .positions(positions)
                 .containingNodeId(image.getTreeId())
                 .closestHeadline(image.getHeadline().getTextBlock().getSearchText())
-                .section(redactionLogEntry.getSection())
+                .section(image.getManualOverwrite().getSection()
+                        .orElse(redactionLogEntry.getSection()))
                 .textAfter(redactionLogEntry.getTextAfter())
                 .textBefore(redactionLogEntry.getTextBefore())
                 .imageHasTransparency(image.isTransparent())
                 .state(buildEntryState(image))
-                .entryType(redactionLogEntry.isHint() ? EntryType.IMAGE_HINT : EntryType.IMAGE)
+                .entryType(dictionaryService.isHint(imageType, dossierTemplateId) ? EntryType.IMAGE_HINT : EntryType.IMAGE)
                 .build();
     }
@@ -265,12 +288,14 @@ public final class MigrationEntity {
         return EntityLogEntry.builder()
                 .id(precursorEntity.getId())
                 .reason(precursorEntity.buildReasonWithManualChangeDescriptions())
-                .legalBasis(precursorEntity.legalBasis())
+                .legalBasis(precursorEntity.getManualOverwrite().getLegalBasis()
+                        .orElse(redactionLogEntry.getLegalBasis()))
                 .value(precursorEntity.value())
                 .type(precursorEntity.type())
                 .state(buildEntryState(precursorEntity))
                 .entryType(buildEntryType(precursorEntity))
-                .section(redactionLogEntry.getSection())
+                .section(precursorEntity.getManualOverwrite().getSection()
+                        .orElse(redactionLogEntry.getSection()))
                 .textAfter(redactionLogEntry.getTextAfter())
                 .textBefore(redactionLogEntry.getTextBefore())
                 .containingNodeId(Collections.emptyList())
@@ -280,12 +305,11 @@ public final class MigrationEntity {
                 .dossierDictionaryEntry(precursorEntity.isDossierDictionaryEntry())
                 .startOffset(-1)
                 .endOffset(-1)
-                .positions(precursorEntity.getManualOverwrite()
-                        .getPositions()
-                        .orElse(precursorEntity.getEntityPosition())
-                        .stream()
-                        .map(entityPosition -> new Position(entityPosition.rectangle2D(), entityPosition.pageNumber()))
-                        .toList())
+                .positions(precursorEntity.getManualOverwrite().getPositions()
+                        .orElse(precursorEntity.getEntityPosition())
+                        .stream()
+                        .map(entityPosition -> new Position(entityPosition.rectangle2D(), entityPosition.pageNumber()))
+                        .toList())
                 .engines(Collections.emptySet())
                 .build();
     }
@@ -299,12 +323,15 @@ public final class MigrationEntity {
                 .id(entity.getId())
                 .positions(rectanglesPerLine)
                 .reason(entity.buildReasonWithManualChangeDescriptions())
-                .legalBasis(entity.legalBasis())
-                .value(entity.getManualOverwrite().getValue().orElse(entity.getMatchedRule().isWriteValueWithLineBreaks() ? entity.getValueWithLineBreaks() : entity.getValue()))
+                .legalBasis(entity.getManualOverwrite().getLegalBasis()
+                        .orElse(redactionLogEntry.getLegalBasis()))
+                .value(entity.getManualOverwrite().getValue()
+                        .orElse(entity.getMatchedRule().isWriteValueWithLineBreaks() ? entity.getValueWithLineBreaks() : entity.getValue()))
                 .type(entity.type())
-                .section(redactionLogEntry.getSection())
-                .textAfter(redactionLogEntry.getTextAfter())
-                .textBefore(redactionLogEntry.getTextBefore())
+                .section(entity.getManualOverwrite().getSection()
+                        .orElse(redactionLogEntry.getSection()))
+                .textAfter(entity.getTextAfter())
+                .textBefore(entity.getTextBefore())
                 .containingNodeId(entity.getDeepestFullyContainingNode().getTreeId())
                 .closestHeadline(entity.getDeepestFullyContainingNode().getHeadline().getTextBlock().getSearchText())
                 .matchedRule(entity.getMatchedRule().getRuleIdentifier().toString())
@@ -322,54 +349,136 @@ public final class MigrationEntity {
     private static List<Position> getRectanglesPerLine(TextEntity entity) {
         return getPositionsFromOverride(entity).orElse(entity.getPositionsOnPagePerPage()
-                .get(0)
-                .getRectanglePerLine()
-                .stream()
-                .map(rectangle2D -> new Position(rectangle2D, entity.getPositionsOnPagePerPage().get(0).getPage().getNumber()))
-                .toList());
+                .get(0).getRectanglePerLine()
+                .stream()
+                .map(rectangle2D -> new Position(rectangle2D,
+                        entity.getPositionsOnPagePerPage()
+                                .get(0).getPage().getNumber()))
+                .toList());
     }
 
     private static Optional<List<Position>> getPositionsFromOverride(IEntity entity) {
-        return entity.getManualOverwrite().getPositions().map(rects -> rects.stream().map(r -> new Position(r.rectangle2D(), r.pageNumber())).toList());
+        return entity.getManualOverwrite().getPositions()
+                .map(rects -> rects.stream()
+                        .map(r -> new Position(r.rectangle2D(), r.pageNumber()))
+                        .toList());
     }
 
-    private EntryState buildEntryState(IEntity entity) {
-        if (entity.applied() && entity.active()) {
-            return EntryState.APPLIED;
-        } else if (entity.skipped() && entity.active()) {
-            return EntryState.SKIPPED;
-        } else if (entity.ignored()) {
-            return EntryState.IGNORED;
-        } else {
-            return EntryState.REMOVED;
-        }
-    }
-
-    private EntryType buildEntryType(IEntity entity) {
-        if (entity instanceof TextEntity textEntity) {
-            return getEntryType(textEntity.getEntityType());
-        } else if (entity instanceof PrecursorEntity precursorEntity) {
-            if (precursorEntity.isRectangle()) {
-                return EntryType.AREA;
-            }
-            return getEntryType(precursorEntity.getEntityType());
-        } else if (entity instanceof Image) {
-            return EntryType.IMAGE;
-        }
-        throw new UnsupportedOperationException(String.format("Entity subclass %s is not implemented!", entity.getClass()));
-    }
-
-    public boolean hasManualChangesOrComments() {
+    public boolean hasManualChangesOrComments(Set<String> entitiesWithComments, Set<String> entitiesWithUnprocessedChanges) {
         return !(redactionLogEntry.getManualChanges() == null || redactionLogEntry.getManualChanges().isEmpty()) || //
-                !(redactionLogEntry.getComments() == null || redactionLogEntry.getComments().isEmpty());
+                !(redactionLogEntry.getComments() == null || redactionLogEntry.getComments().isEmpty()) //
+                || hasManualChanges() || entitiesWithComments.contains(oldId) || entitiesWithUnprocessedChanges.contains(oldId);
+    }
+
+    public boolean hasManualChanges() {
+        return !manualChanges.isEmpty();
+    }
+
+    public void applyManualChanges(List<BaseAnnotation> manualChangesToApply, ManualChangesApplicationService manualChangesApplicationService) {
+        manualChanges.addAll(manualChangesToApply);
+        manualChangesToApply.forEach(manualChange -> {
+            if (manualChange instanceof ManualResizeRedaction manualResizeRedaction && migratedEntity instanceof TextEntity textEntity) {
+                manualResizeRedaction.setAnnotationId(newId);
+                manualChangesApplicationService.resize(textEntity, manualResizeRedaction);
+            } else if (manualChange instanceof ManualRecategorization manualRecategorization && migratedEntity instanceof Image image) {
+                image.setImageType(ImageType.fromString(manualRecategorization.getType()));
+                migratedEntity.getManualOverwrite().addChange(manualChange);
+            } else {
+                migratedEntity.getManualOverwrite().addChange(manualChange);
+            }
+        });
+    }
+
+    public ManualRedactionEntry buildManualRedactionEntry() {
+        assert hasManualChanges();
+        // currently we need to insert a manual redaction entry, whenever an entity has been resized.
+        String user = manualChanges.stream()
+                .filter(mc -> mc instanceof ManualResizeRedaction)
+                .findFirst()
+                .orElse(manualChanges.get(0)).getUser();
+        OffsetDateTime requestDate = manualChanges.get(0).getRequestDate();
+        return ManualRedactionEntry.builder()
+                .annotationId(newId)
+                .fileId(fileId)
+                .user(user)
+                .requestDate(requestDate)
+                .type(redactionLogEntry.getType())
+                .value(redactionLogEntry.getValue())
+                .reason(redactionLogEntry.getReason())
+                .legalBasis(redactionLogEntry.getLegalBasis())
+                .section(redactionLogEntry.getSection())
+                .rectangle(false)
+                .addToDictionary(false)
+                .addToDossierDictionary(false)
+                .positions(buildPositions(migratedEntity))
+                .textAfter(redactionLogEntry.getTextAfter())
+                .textBefore(redactionLogEntry.getTextBefore())
+                .dictionaryEntryType(DictionaryEntryType.ENTRY)
+                .build();
+    }
+
+    private List<Rectangle> buildPositions(IEntity entity) {
+        if (entity instanceof TextEntity textEntity) {
+            var positionsOnPage = textEntity.getPositionsOnPagePerPage()
+                    .get(0);
+            return positionsOnPage.getRectanglePerLine()
+                    .stream()
+                    .map(p -> new Rectangle((float) p.getX(), (float) p.getY(), (float) p.getWidth(), (float) p.getHeight(), positionsOnPage.getPage().getNumber()))
+                    .toList();
+        }
+        if (entity instanceof PrecursorEntity pEntity) {
+            return pEntity.getManualOverwrite().getPositions()
+                    .orElse(pEntity.getEntityPosition())
+                    .stream()
+                    .map(p -> new Rectangle((float) p.rectangle2D().getX(),
+                            (float) p.rectangle2D().getY(),
+                            (float) p.rectangle2D().getWidth(),
+                            (float) p.rectangle2D().getHeight(),
+                            p.pageNumber()))
+                    .toList();
+        }
+        if (entity instanceof Image image) {
+            Rectangle2D position = image.getManualOverwrite().getPositions()
+                    .map(p -> p.get(0).rectangle2D())
+                    .orElse(image.getPosition());
+            return List.of(new Rectangle((float) position.getX(), (float) position.getY(), (float) position.getWidth(), (float) position.getHeight(), image.getPage().getNumber()));
+        } else {
+            throw new UnsupportedOperationException();
+        }
+    }
+
+    public boolean needsManualEntry() {
+        return manualChanges.stream()
+                .anyMatch(mc -> mc instanceof ManualResizeRedaction && !((ManualResizeRedaction) mc).getUpdateDictionary()) && !(migratedEntity instanceof Image);
+    }
+
+    public boolean needsForceDeletion() {
+        return manualChanges.stream()
+                .anyMatch(mc -> mc instanceof ManualForceRedaction) && this.precursorEntity != null && this.precursorEntity.removed();
     }
 }


@@ -43,7 +43,6 @@ public class PrecursorEntity implements IEntity {
     String type;
     String section;
     EntityType entityType;
-    EntryType entryType;
     boolean applied;
     boolean isDictionaryEntry;
     boolean isDossierDictionaryEntry;
@@ -61,8 +60,8 @@ public class PrecursorEntity implements IEntity {
                 .stream()
                 .map(RectangleWithPage::fromAnnotationRectangle)
                 .toList();
         var entityType = hint ? EntityType.HINT : EntityType.ENTITY;
-        var entryType = hint ? EntryType.HINT : (manualRedactionEntry.isRectangle() ? EntryType.AREA : EntryType.ENTITY);
         ManualChangeOverwrite manualChangeOverwrite = new ManualChangeOverwrite(entityType);
         manualChangeOverwrite.addChange(manualRedactionEntry);
         return PrecursorEntity.builder()
@@ -75,7 +74,6 @@ public class PrecursorEntity implements IEntity {
                 .type(manualRedactionEntry.getType())
                 .section(manualRedactionEntry.getSection())
                 .entityType(entityType)
-                .entryType(entryType)
                 .applied(true)
                 .isDictionaryEntry(false)
                 .isDossierDictionaryEntry(false)
@@ -103,7 +101,6 @@ public class PrecursorEntity implements IEntity {
                 .type(entityLogEntry.getType())
                 .section(entityLogEntry.getSection())
                 .entityType(entityType)
-                .entryType(entityLogEntry.getEntryType())
                 .isDictionaryEntry(entityLogEntry.isDictionaryEntry())
                 .isDossierDictionaryEntry(entityLogEntry.isDossierDictionaryEntry())
                 .manualOverwrite(new ManualChangeOverwrite(entityType))
@@ -134,7 +131,6 @@ public class PrecursorEntity implements IEntity {
                 .type(Optional.ofNullable(importedRedaction.getType())
                         .orElse(IMPORTED_REDACTION_TYPE))
                 .entityType(entityType)
-                .entryType(entryType)
                 .isDictionaryEntry(false)
                 .isDossierDictionaryEntry(false)
                 .rectangle(value.isBlank() || entryType.equals(EntryType.IMAGE) || entryType.equals(EntryType.IMAGE_HINT) || entryType.equals(EntryType.AREA))


@@ -2,6 +2,7 @@ package com.iqser.red.service.redaction.v1.server.model.document;
 import static java.lang.String.format;
 
+import java.util.ArrayList;
 import java.util.Collections;
 import java.util.LinkedList;
 import java.util.List;
@@ -40,7 +41,10 @@ public class DocumentTree {
     public TextBlock buildTextBlock() {
-        return allEntriesInOrder().map(Entry::getNode).filter(SemanticNode::isLeaf).map(SemanticNode::getLeafTextBlock).collect(new TextBlockCollector());
+        return allEntriesInOrder().map(Entry::getNode)
+                .filter(SemanticNode::isLeaf)
+                .map(SemanticNode::getLeafTextBlock)
+                .collect(new TextBlockCollector());
     }
@@ -89,8 +93,8 @@ public class DocumentTree {
         if (treeId.isEmpty()) {
             return root != null;
         }
-        Entry entry = root.children.get(treeId.get(0));
-        for (int id : treeId.subList(1, treeId.size())) {
+        Entry entry = root;
+        for (int id : treeId) {
             if (id >= entry.children.size() || 0 > id) {
                 return false;
             }
@@ -114,13 +118,78 @@ public class DocumentTree {
     public Stream<SemanticNode> childNodes(List<Integer> treeId) {
-        return getEntryById(treeId).children.stream().map(Entry::getNode);
+        return getEntryById(treeId).children.stream()
+                .map(Entry::getNode);
+    }
+
+    /**
+     * Finds all child nodes of the specified entry whose textRange intersects the given textRange. It does this by locating, via binary search, the first entry whose textRange contains the start index of the given TextRange.
+     * It then iterates over the remaining children, adding each to the intersections until one starts at or after the end of the TextRange. All intersected entries are returned as SemanticNodes.
+     *
+     * @param treeId the treeId of the Entry whose children shall be checked.
+     * @param textRange the TextRange to find intersecting child nodes for.
+     * @return a list of all SemanticNodes that are direct children of the specified Entry and whose TextRange intersects the given TextRange
+     */
+    public List<SemanticNode> findIntersectingChildNodes(List<Integer> treeId, TextRange textRange) {
+        List<Entry> childEntries = getEntryById(treeId).getChildren();
+        List<SemanticNode> intersectingChildEntries = new LinkedList<>();
+        int startIdx = findFirstIdxOfContainingChildBinarySearch(childEntries, textRange.start());
+        if (startIdx < 0) {
+            return intersectingChildEntries;
+        }
+        for (int i = startIdx; i < childEntries.size(); i++) {
+            if (childEntries.get(i).getNode().getTextRange().start() < textRange.end()) {
+                intersectingChildEntries.add(childEntries.get(i).getNode());
+            } else {
+                break;
+            }
+        }
+        return intersectingChildEntries;
+    }
+
+    public Optional<SemanticNode> findFirstContainingChild(List<Integer> treeId, TextRange textRange) {
+        List<Entry> childEntries = getEntryById(treeId).getChildren();
+        int startIdx = findFirstIdxOfContainingChildBinarySearch(childEntries, textRange.start());
+        if (startIdx < 0) {
+            return Optional.empty();
+        }
+        if (childEntries.get(startIdx).getNode().getTextRange().contains(textRange.end())) {
+            return Optional.of(childEntries.get(startIdx).getNode());
+        }
+        return Optional.empty();
+    }
+
+    private int findFirstIdxOfContainingChildBinarySearch(List<Entry> childNodes, int start) {
+        int low = 0;
+        int high = childNodes.size() - 1;
+        while (low <= high) {
+            int mid = low + (high - low) / 2;
+            TextRange range = childNodes.get(mid).getNode().getTextRange();
+            if (range.start() > start) {
+                high = mid - 1;
+            } else if (range.end() <= start) {
+                low = mid + 1;
+            } else {
+                return mid;
+            }
+        }
+        return -1;
     }
 
     public Stream<SemanticNode> childNodesOfType(List<Integer> treeId, NodeType nodeType) {
-        return getEntryById(treeId).children.stream().filter(entry -> entry.node.getType().equals(nodeType)).map(Entry::getNode);
+        return getEntryById(treeId).children.stream()
+                .filter(entry -> entry.node.getType().equals(nodeType))
+                .map(Entry::getNode);
     }
@@ -199,26 +268,32 @@ public class DocumentTree {
     public Stream<Entry> allEntriesInOrder() {
-        return Stream.of(root).flatMap(DocumentTree::flatten);
+        return Stream.of(root)
+                .flatMap(DocumentTree::flatten);
     }
 
     public Stream<Entry> allSubEntriesInOrder(List<Integer> parentId) {
-        return getEntryById(parentId).children.stream().flatMap(DocumentTree::flatten);
+        return getEntryById(parentId).children.stream()
+                .flatMap(DocumentTree::flatten);
     }
 
     @Override
     public String toString() {
-        return String.join("\n", allEntriesInOrder().map(Entry::toString).toList());
+        return String.join("\n",
+                allEntriesInOrder().map(Entry::toString)
+                        .toList());
     }
 
     private static Stream<Entry> flatten(Entry entry) {
-        return Stream.concat(Stream.of(entry), entry.children.stream().flatMap(DocumentTree::flatten));
+        return Stream.concat(Stream.of(entry),
+                entry.children.stream()
+                        .flatMap(DocumentTree::flatten));
     }
@@ -240,7 +315,7 @@ public class DocumentTree {
         List<Integer> treeId;
         SemanticNode node;
         @Builder.Default
-        List<Entry> children = new LinkedList<>();
+        List<Entry> children = new ArrayList<>();
 
         @Override


@@ -92,12 +92,18 @@ public class TextRange implements Comparable<TextRange> {
     public List<TextRange> split(List<Integer> splitIndices) {
-        if (splitIndices.stream().anyMatch(idx -> !this.contains(idx))) {
-            throw new IndexOutOfBoundsException(format("%s splitting indices are out of range for %s", splitIndices.stream().filter(idx -> !this.contains(idx)).toList(), this));
+        if (splitIndices.stream()
+                .anyMatch(idx -> !this.contains(idx))) {
+            throw new IndexOutOfBoundsException(format("%s splitting indices are out of range for %s",
+                    splitIndices.stream()
+                            .filter(idx -> !this.contains(idx))
+                            .toList(),
+                    this));
         }
         List<TextRange> splitBoundaries = new LinkedList<>();
         int previousIndex = start;
-        for (int splitIndex : splitIndices) {
+        for (int i = 0, splitIndicesSize = splitIndices.size(); i < splitIndicesSize; i++) {
+            int splitIndex = splitIndices.get(i);
 
             // skip split if it would produce a boundary of length 0
             if (splitIndex == previousIndex) {
@@ -113,8 +119,12 @@ public class TextRange implements Comparable<TextRange> {
     public static TextRange merge(Collection<TextRange> boundaries) {
-        int minStart = boundaries.stream().mapToInt(TextRange::start).min().orElseThrow(IllegalArgumentException::new);
-        int maxEnd = boundaries.stream().mapToInt(TextRange::end).max().orElseThrow(IllegalArgumentException::new);
+        int minStart = boundaries.stream()
+                .mapToInt(TextRange::start)
+                .min().orElseThrow(IllegalArgumentException::new);
+        int maxEnd = boundaries.stream()
+                .mapToInt(TextRange::end)
+                .max().orElseThrow(IllegalArgumentException::new);
         return new TextRange(minStart, maxEnd);
     }


@@ -27,12 +27,12 @@ import lombok.experimental.FieldDefaults;
 public class ManualChangeOverwrite {
 
     private static final Map<Class<? extends BaseAnnotation>, String> MANUAL_CHANGE_DESCRIPTIONS = Map.of(//
             ManualRedactionEntry.class, "created by manual change", //
             ManualLegalBasisChange.class, "legal basis was manually changed", //
             ManualResizeRedaction.class, "resized by manual override", //
             ManualForceRedaction.class, "forced by manual override", //
             IdRemoval.class, "removed by manual override", //
             ManualRecategorization.class, "recategorized by manual override");
 
     List<BaseAnnotation> manualChanges = new LinkedList<>();
     boolean changed;
@@ -80,7 +80,8 @@ public class ManualChangeOverwrite {
         manualChanges.sort(Comparator.comparing(BaseAnnotation::getRequestDate));
         updateFields(manualChanges);
         // make list unmodifiable.
-        return manualChanges.stream().toList();
+        return manualChanges.stream()
+                .toList();
     }
@ -121,14 +122,28 @@ public class ManualChangeOverwrite {
resized = true; resized = true;
// This is only for not found Manual Entities. // This is only for not found Manual Entities.
value = manualResizeRedaction.getValue(); value = manualResizeRedaction.getValue();
positions = manualResizeRedaction.getPositions().stream().map(RectangleWithPage::fromAnnotationRectangle).toList(); positions = manualResizeRedaction.getPositions()
.stream()
.map(RectangleWithPage::fromAnnotationRectangle)
.toList();
} }
if (manualChange instanceof ManualRecategorization recategorization) { if (manualChange instanceof ManualRecategorization recategorization) {
// recategorization logic happens in ManualChangesApplicationService. // recategorization logic happens in ManualChangesApplicationService.
recategorized = true; recategorized = true;
// this is only relevant for ManualEntities. Image and TextEntity is recategorized in the ManualChangesApplicationService. // this is only relevant for ManualEntities. Image and TextEntity is recategorized in the ManualChangesApplicationService.
type = recategorization.getType(); if (recategorization.getType() != null) {
type = recategorization.getType();
}
if (recategorization.getSection() != null) {
section = recategorization.getSection();
}
if (recategorization.getValue() != null) {
value = recategorization.getValue();
}
if (recategorization.getLegalBasis() != null) {
legalBasis = recategorization.getLegalBasis();
}
} }
descriptions.add(MANUAL_CHANGE_DESCRIPTIONS.get(manualChange.getClass())); descriptions.add(MANUAL_CHANGE_DESCRIPTIONS.get(manualChange.getClass()));
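The null guards introduced above make a recategorization merge field-by-field: a later change only overwrites the fields it actually sets, so a recategorization that carries only a legal basis (the RED-9042 case) no longer wipes out the type, section, or value from an earlier one. A minimal standalone sketch of that merge rule, using illustrative record and field names rather than the service's real classes:

```java
import java.util.List;

public class RecategorizationMergeSketch {

    // Hypothetical stand-in for a recategorization request; null means "field not set".
    record Recat(String type, String section, String value, String legalBasis) {}

    // A change only overwrites the fields it carries; null fields leave the current value intact.
    static Recat merge(Recat current, Recat change) {
        return new Recat(
                change.type() != null ? change.type() : current.type(),
                change.section() != null ? change.section() : current.section(),
                change.value() != null ? change.value() : current.value(),
                change.legalBasis() != null ? change.legalBasis() : current.legalBasis());
    }

    // Apply a sequence of changes in request order, as the sorted manualChanges list is applied.
    static Recat applyAll(Recat initial, List<Recat> changes) {
        Recat result = initial;
        for (Recat change : changes) {
            result = merge(result, change);
        }
        return result;
    }

    public static void main(String[] args) {
        Recat initial = new Recat("PERSON", "A", "John", null);
        // The second change sets only the legal basis; type, section, and value survive.
        Recat merged = applyAll(initial, List.of(new Recat(null, null, null, "Art. 4(2)")));
        System.out.println(merged);
    }
}
```
Without the null checks, the unconditional assignment `type = recategorization.getType()` would reset `type` to null whenever a later change carried only a legal basis.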


@@ -17,7 +17,6 @@ import com.iqser.red.service.redaction.v1.server.model.document.entity.ManualCha
 import com.iqser.red.service.redaction.v1.server.model.document.entity.MatchedRule;
 import com.iqser.red.service.redaction.v1.server.model.document.entity.TextEntity;
 import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlock;
-import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlockCollector;

 import lombok.AccessLevel;
 import lombok.AllArgsConstructor;

@@ -39,6 +38,8 @@ public class Image implements GenericSemanticNode, IEntity {
 	List<Integer> treeId;
 	String id;
+	TextBlock leafTextBlock;
 	ImageType imageType;
 	boolean transparent;
 	Rectangle2D position;

@@ -49,14 +50,11 @@ public class Image implements GenericSemanticNode, IEntity {
 	@Builder.Default
 	ManualChangeOverwrite manualOverwrite = new ManualChangeOverwrite();

-	@EqualsAndHashCode.Exclude
 	Page page;

-	@EqualsAndHashCode.Exclude
 	DocumentTree documentTree;

 	@Builder.Default
-	@EqualsAndHashCode.Exclude
 	Set<TextEntity> entities = new HashSet<>();

@@ -70,7 +68,7 @@ public class Image implements GenericSemanticNode, IEntity {
 	@Override
 	public TextBlock getTextBlock() {
-		return streamAllSubNodes().filter(SemanticNode::isLeaf).map(SemanticNode::getLeafTextBlock).collect(new TextBlockCollector());
+		return leafTextBlock;
 	}

@@ -84,14 +82,21 @@ public class Image implements GenericSemanticNode, IEntity {
 	@Override
 	public TextRange getTextRange() {
-		return GenericSemanticNode.super.getTextRange();
+		return leafTextBlock.getTextRange();
+	}
+
+	@Override
+	public int length() {
+		return getTextRange().length();
 	}

 	@Override
 	public String type() {
-		return getManualOverwrite().getType().orElse(imageType.toString());
+		return getManualOverwrite().getType().orElse(imageType.toString().toLowerCase(Locale.ENGLISH));
 	}

@@ -123,10 +128,4 @@ public class Image implements GenericSemanticNode, IEntity {
 		return name.charAt(0) + name.substring(1).toLowerCase(Locale.ENGLISH);
 	}

-	public int length() {
-		return 0;
-	}
 }
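The `type()` hunk above lowercases the enum name with an explicit `Locale.ENGLISH`. That matters because `String.toLowerCase()` without an argument uses the JVM's default locale, and some locales apply surprising case mappings — most famously Turkish, where uppercase `I` lowercases to the dotless `ı`. A small standalone demonstration (not part of the service code):

```java
import java.util.Locale;

public class LocaleLowercaseDemo {
    public static void main(String[] args) {
        // Explicit locale: stable result regardless of the machine's default locale.
        System.out.println("TITLE".toLowerCase(Locale.ENGLISH)); // title

        // Turkish locale: 'I' maps to dotless 'ı', so the result is NOT "title".
        System.out.println("TITLE".toLowerCase(new Locale("tr", "TR")));
    }
}
```
Pinning the locale keeps type strings byte-for-byte identical across deployments, which is essential when they are later compared with `equals`.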


@@ -70,7 +70,9 @@ public interface SemanticNode {
 	 */
 	default Page getFirstPage() {
-		return getTextBlock().getPages().stream().min(Comparator.comparingInt(Page::getNumber)).orElseThrow();
+		return getTextBlock().getPages()
+				.stream()
+				.min(Comparator.comparingInt(Page::getNumber)).orElseThrow();
 	}

@@ -96,7 +98,8 @@
 	 */
 	default boolean onPage(int pageNumber) {
-		return getPages().stream().anyMatch(page -> page.getNumber() == pageNumber);
+		return getPages().stream()
+				.anyMatch(page -> page.getNumber() == pageNumber);
 	}

@@ -248,7 +251,9 @@
 	 */
 	default boolean hasEntitiesOfType(String type) {
-		return getEntities().stream().filter(TextEntity::active).anyMatch(redactionEntity -> redactionEntity.type().equals(type));
+		return getEntities().stream()
+				.filter(TextEntity::active)
+				.anyMatch(redactionEntity -> redactionEntity.type().equals(type));
 	}

@@ -261,7 +266,10 @@
 	 */
 	default boolean hasEntitiesOfAnyType(String... types) {
-		return getEntities().stream().filter(TextEntity::active).anyMatch(redactionEntity -> Arrays.stream(types).anyMatch(type -> redactionEntity.type().equals(type)));
+		return getEntities().stream()
+				.filter(TextEntity::active)
+				.anyMatch(redactionEntity -> Arrays.stream(types)
+						.anyMatch(type -> redactionEntity.type().equals(type)));
 	}

@@ -274,7 +282,12 @@
 	 */
 	default boolean hasEntitiesOfAllTypes(String... types) {
-		return getEntities().stream().filter(TextEntity::active).map(TextEntity::type).collect(Collectors.toUnmodifiableSet()).containsAll(Arrays.stream(types).toList());
+		return getEntities().stream()
+				.filter(TextEntity::active)
+				.map(TextEntity::type)
+				.collect(Collectors.toUnmodifiableSet())
+				.containsAll(Arrays.stream(types)
+						.toList());
 	}

@@ -287,7 +300,10 @@
 	 */
 	default List<TextEntity> getEntitiesOfType(String type) {
-		return getEntities().stream().filter(TextEntity::active).filter(redactionEntity -> redactionEntity.type().equals(type)).toList();
+		return getEntities().stream()
+				.filter(TextEntity::active)
+				.filter(redactionEntity -> redactionEntity.type().equals(type))
+				.toList();
 	}

@@ -300,7 +316,10 @@
 	 */
 	default List<TextEntity> getEntitiesOfType(List<String> types) {
-		return getEntities().stream().filter(TextEntity::active).filter(redactionEntity -> redactionEntity.isAnyType(types)).toList();
+		return getEntities().stream()
+				.filter(TextEntity::active)
+				.filter(redactionEntity -> redactionEntity.isAnyType(types))
+				.toList();
 	}

@@ -313,7 +332,11 @@
 	 */
 	default List<TextEntity> getEntitiesOfType(String... types) {
-		return getEntities().stream().filter(TextEntity::active).filter(redactionEntity -> redactionEntity.isAnyType(Arrays.stream(types).toList())).toList();
+		return getEntities().stream()
+				.filter(TextEntity::active)
+				.filter(redactionEntity -> redactionEntity.isAnyType(Arrays.stream(types)
+						.toList()))
+				.toList();
 	}

@@ -365,7 +388,8 @@
 	 */
 	default boolean containsAllStrings(String... strings) {
-		return Arrays.stream(strings).allMatch(this::containsString);
+		return Arrays.stream(strings)
+				.allMatch(this::containsString);
 	}

@@ -377,7 +401,8 @@
 	 */
 	default boolean containsAnyString(String... strings) {
-		return Arrays.stream(strings).anyMatch(this::containsString);
+		return Arrays.stream(strings)
+				.anyMatch(this::containsString);
 	}

@@ -389,7 +414,8 @@
 	 */
 	default boolean containsAnyString(List<String> strings) {
-		return strings.stream().anyMatch(this::containsString);
+		return strings.stream()
+				.anyMatch(this::containsString);
 	}

@@ -413,7 +439,8 @@
 	 */
 	default boolean containsAnyStringIgnoreCase(String... strings) {
-		return Arrays.stream(strings).anyMatch(this::containsStringIgnoreCase);
+		return Arrays.stream(strings)
+				.anyMatch(this::containsStringIgnoreCase);
 	}

@@ -425,7 +452,8 @@
 	 */
 	default boolean containsAllStringsIgnoreCase(String... strings) {
-		return Arrays.stream(strings).allMatch(this::containsStringIgnoreCase);
+		return Arrays.stream(strings)
+				.allMatch(this::containsStringIgnoreCase);
 	}

@@ -437,7 +465,9 @@
 	 */
 	default boolean containsWord(String word) {
-		return getTextBlock().getWords().stream().anyMatch(s -> s.equals(word));
+		return getTextBlock().getWords()
+				.stream()
+				.anyMatch(s -> s.equals(word));
 	}

@@ -449,7 +479,10 @@
 	 */
 	default boolean containsWordIgnoreCase(String word) {
-		return getTextBlock().getWords().stream().map(String::toLowerCase).anyMatch(s -> s.equals(word.toLowerCase(Locale.ENGLISH)));
+		return getTextBlock().getWords()
+				.stream()
+				.map(String::toLowerCase)
+				.anyMatch(s -> s.equals(word.toLowerCase(Locale.ENGLISH)));
 	}

@@ -461,7 +494,10 @@
 	 */
 	default boolean containsAnyWord(String... words) {
-		return Arrays.stream(words).anyMatch(word -> getTextBlock().getWords().stream().anyMatch(word::equals));
+		return Arrays.stream(words)
+				.anyMatch(word -> getTextBlock().getWords()
+						.stream()
+						.anyMatch(word::equals));
 	}

@@ -473,7 +509,12 @@
 	 */
 	default boolean containsAnyWordIgnoreCase(String... words) {
-		return Arrays.stream(words).map(String::toLowerCase).anyMatch(word -> getTextBlock().getWords().stream().map(String::toLowerCase).anyMatch(word::equals));
+		return Arrays.stream(words)
+				.map(String::toLowerCase)
+				.anyMatch(word -> getTextBlock().getWords()
+						.stream()
+						.map(String::toLowerCase)
+						.anyMatch(word::equals));
 	}

@@ -485,7 +526,10 @@
 	 */
 	default boolean containsAllWords(String... words) {
-		return Arrays.stream(words).allMatch(word -> getTextBlock().getWords().stream().anyMatch(word::equals));
+		return Arrays.stream(words)
+				.allMatch(word -> getTextBlock().getWords()
+						.stream()
+						.anyMatch(word::equals));
 	}

@@ -497,7 +541,12 @@
 	 */
 	default boolean containsAllWordsIgnoreCase(String... words) {
-		return Arrays.stream(words).map(String::toLowerCase).allMatch(word -> getTextBlock().getWords().stream().map(String::toLowerCase).anyMatch(word::equals));
+		return Arrays.stream(words)
+				.map(String::toLowerCase)
+				.allMatch(word -> getTextBlock().getWords()
+						.stream()
+						.map(String::toLowerCase)
+						.anyMatch(word::equals));
 	}

@@ -537,7 +586,11 @@
 	 */
 	default boolean intersectsRectangle(int x, int y, int w, int h, int pageNumber) {
-		return getBBox().entrySet().stream().filter(entry -> entry.getKey().getNumber() == pageNumber).map(Map.Entry::getValue).anyMatch(rect -> rect.intersects(x, y, w, h));
+		return getBBox().entrySet()
+				.stream()
+				.filter(entry -> entry.getKey().getNumber() == pageNumber)
+				.map(Map.Entry::getValue)
+				.anyMatch(rect -> rect.intersects(x, y, w, h));
 	}

@@ -556,7 +609,7 @@
 		}
 		textEntity.addIntersectingNode(this);
-		streamChildren().filter(semanticNode -> semanticNode.getTextRange().intersects(textEntity.getTextRange()))
+		getDocumentTree().findIntersectingChildNodes(getTreeId(), textEntity.getTextRange())
 				.forEach(node -> node.addThisToEntityIfIntersects(textEntity));
 		}
 	}

@@ -591,7 +644,8 @@
 	 */
 	default Stream<SemanticNode> streamAllSubNodes() {
-		return getDocumentTree().allSubEntriesInOrder(getTreeId()).map(DocumentTree.Entry::getNode);
+		return getDocumentTree().allSubEntriesInOrder(getTreeId())
+				.map(DocumentTree.Entry::getNode);
 	}

@@ -602,7 +656,9 @@
 	 */
 	default Stream<SemanticNode> streamAllSubNodesOfType(NodeType nodeType) {
-		return getDocumentTree().allSubEntriesInOrder(getTreeId()).filter(entry -> entry.getType().equals(nodeType)).map(DocumentTree.Entry::getNode);
+		return getDocumentTree().allSubEntriesInOrder(getTreeId())
+				.filter(entry -> entry.getType().equals(nodeType))
+				.map(DocumentTree.Entry::getNode);
 	}

@@ -641,7 +697,8 @@
 		if (isLeaf()) {
 			return getTextBlock().getPositionsPerPage(textRange);
 		}
-		Optional<SemanticNode> containingChildNode = streamChildren().filter(child -> child.getTextRange().contains(textRange)).findFirst();
+		Optional<SemanticNode> containingChildNode = getDocumentTree().findFirstContainingChild(getTreeId(), textRange);
 		if (containingChildNode.isEmpty()) {
 			return getTextBlock().getPositionsPerPage(textRange);
 		}

@@ -691,8 +748,12 @@
 	private Map<Page, Rectangle2D> getBBoxFromChildren() {
 		Map<Page, Rectangle2D> bBoxPerPage = new HashMap<>();
-		List<Map<Page, Rectangle2D>> childrenBBoxes = streamChildren().map(SemanticNode::getBBox).toList();
-		Set<Page> pages = childrenBBoxes.stream().flatMap(map -> map.keySet().stream()).collect(Collectors.toSet());
+		List<Map<Page, Rectangle2D>> childrenBBoxes = streamChildren().map(SemanticNode::getBBox)
+				.toList();
+		Set<Page> pages = childrenBBoxes.stream()
+				.flatMap(map -> map.keySet()
+						.stream())
+				.collect(Collectors.toSet());
 		for (Page page : pages) {
 			Rectangle2D bBoxOnPage = childrenBBoxes.stream()
 					.filter(childBboxPerPage -> childBboxPerPage.containsKey(page))

@@ -710,7 +771,9 @@
 	private Map<Page, Rectangle2D> getBBoxFromLeafTextBlock() {
 		Map<Page, Rectangle2D> bBoxPerPage = new HashMap<>();
-		Map<Page, List<AtomicTextBlock>> atomicTextBlockPerPage = getTextBlock().getAtomicTextBlocks().stream().collect(Collectors.groupingBy(AtomicTextBlock::getPage));
+		Map<Page, List<AtomicTextBlock>> atomicTextBlockPerPage = getTextBlock().getAtomicTextBlocks()
+				.stream()
+				.collect(Collectors.groupingBy(AtomicTextBlock::getPage));
 		atomicTextBlockPerPage.forEach((page, atomicTextBlocks) -> bBoxPerPage.put(page, RectangleTransformations.atomicTextBlockBBox(atomicTextBlocks)));
 		return bBoxPerPage;
 	}


@@ -9,6 +9,7 @@ import java.util.List;
 import java.util.Locale;
 import java.util.Map;
 import java.util.Set;
+import java.util.stream.Collectors;
 import java.util.stream.IntStream;
 import java.util.stream.Stream;

@@ -64,8 +65,7 @@ public class Table implements SemanticNode {
 	 */
 	public Stream<TextEntity> streamEntitiesWhereRowContainsStringsIgnoreCase(List<String> strings) {
-		return IntStream.range(0, numberOfRows)
-				.boxed()
+		return IntStream.range(0, numberOfRows).boxed()
 				.filter(row -> rowContainsStringsIgnoreCase(row, strings))
 				.flatMap(this::streamRow)
 				.map(TableCell::getEntities)

@@ -82,8 +82,11 @@ public class Table implements SemanticNode {
 	 */
 	public boolean rowContainsStringsIgnoreCase(Integer row, List<String> strings) {
-		String rowText = streamRow(row).map(TableCell::getTextBlock).collect(new TextBlockCollector()).getSearchText().toLowerCase(Locale.ROOT);
-		return strings.stream().map(String::toLowerCase).allMatch(rowText::contains);
+		String rowText = streamRow(row).map(TableCell::getTextBlock)
+				.collect(new TextBlockCollector()).getSearchText().toLowerCase(Locale.ROOT);
+		return strings.stream()
+				.map(String::toLowerCase)
+				.allMatch(rowText::contains);
 	}

@@ -96,9 +99,13 @@ public class Table implements SemanticNode {
 	 */
 	public Stream<TextEntity> streamEntitiesWhereRowHasHeaderAndValue(String header, String value) {
-		List<Integer> vertebrateStudyCols = streamHeaders().filter(headerNode -> headerNode.containsString(header)).map(TableCell::getCol).toList();
+		List<Integer> vertebrateStudyCols = streamHeaders().filter(headerNode -> headerNode.containsString(header))
+				.map(TableCell::getCol)
+				.toList();
 		return streamTableCells().filter(tableCellNode -> vertebrateStudyCols.stream()
-				.anyMatch(vertebrateStudyCol -> getCell(tableCellNode.getRow(), vertebrateStudyCol).containsString(value))).map(TableCell::getEntities).flatMap(Collection::stream);
+				.anyMatch(vertebrateStudyCol -> getCell(tableCellNode.getRow(), vertebrateStudyCol).containsString(value)))
+				.map(TableCell::getEntities)
+				.flatMap(Collection::stream);
 	}

@@ -111,9 +118,13 @@ public class Table implements SemanticNode {
 	 */
 	public Stream<TextEntity> streamEntitiesWhereRowHasHeaderAndAnyValue(String header, List<String> values) {
-		List<Integer> colsWithHeader = streamHeaders().filter(headerNode -> headerNode.containsString(header)).map(TableCell::getCol).toList();
+		List<Integer> colsWithHeader = streamHeaders().filter(headerNode -> headerNode.containsString(header))
+				.map(TableCell::getCol)
+				.toList();
 		return streamTableCells().filter(tableCellNode -> colsWithHeader.stream()
-				.anyMatch(colWithHeader -> getCell(tableCellNode.getRow(), colWithHeader).containsAnyString(values))).map(TableCell::getEntities).flatMap(Collection::stream);
+				.anyMatch(colWithHeader -> getCell(tableCellNode.getRow(), colWithHeader).containsAnyString(values)))
+				.map(TableCell::getEntities)
+				.flatMap(Collection::stream);
 	}

@@ -126,16 +137,33 @@ public class Table implements SemanticNode {
 	 */
 	public Stream<TextEntity> streamEntitiesWhereRowContainsEntitiesOfType(List<String> types) {
-		List<Integer> rowsWithEntityOfType = getEntities().stream()
-				.filter(TextEntity::active)
-				.filter(redactionEntity -> types.stream().anyMatch(type -> type.equals(redactionEntity.type())))
-				.map(TextEntity::getIntersectingNodes)
-				.filter(node -> node instanceof TableCell)
-				.map(node -> (TableCell) node)
-				.map(TableCell::getRow)
-				.toList();
-		return rowsWithEntityOfType.stream().flatMap(this::streamRow).map(TableCell::getEntities).flatMap(Collection::stream);
+		return IntStream.range(0, numberOfRows).boxed()
+				.filter(rowNumber -> streamTextEntitiesInRow(rowNumber).map(TextEntity::type)
+						.anyMatch(types::contains))
+				.flatMap(this::streamRow)
+				.map(TableCell::getEntities)
+				.flatMap(Collection::stream);
+	}
+
+	/**
+	 * Streams all entities in this table, that appear in a row, which contains at least one entity of each of the provided types.
+	 * Ignores Entity with ignored == true or removed == true.
+	 *
+	 * @param types type strings to check whether a row contains an entity like them
+	 * @return Stream of all entities in this table, that appear in a row, which contains at least one entity of each of the provided types.
+	 */
+	public Stream<TextEntity> streamEntitiesWhereRowContainsEntitiesOfEachType(List<String> types) {
+		return IntStream.range(0, numberOfRows).boxed()
+				.filter(rowNumber -> {
+					Set<String> entityTypes = streamTextEntitiesInRow(rowNumber).map(TextEntity::type)
+							.collect(Collectors.toSet());
+					return entityTypes.containsAll(types);
+				})
+				.flatMap(this::streamRow)
+				.map(TableCell::getEntities)
+				.flatMap(Collection::stream);
 	}

@@ -148,18 +176,43 @@ public class Table implements SemanticNode {
 	 */
 	public Stream<TextEntity> streamEntitiesWhereRowContainsNoEntitiesOfType(List<String> types) {
-		return IntStream.range(0, numberOfRows)
-				.boxed()
-				.filter(rowNumber -> streamRow(rowNumber).map(TableCell::getEntities)
-						.flatMap(Collection::stream)
-						.filter(TextEntity::active)
-						.noneMatch(entity -> types.contains(entity.type())))
+		return IntStream.range(0, numberOfRows).boxed()
+				.filter(rowNumber -> streamTextEntitiesInRow(rowNumber).map(TextEntity::type)
+						.noneMatch(types::contains))
 				.flatMap(this::streamRow)
 				.map(TableCell::getEntities)
 				.flatMap(Collection::stream);
 	}
+
+	/**
+	 * Streams all Entities in the given row.
+	 *
+	 * @param rowNumber the row number to look for
+	 * @return stream of TextEntities occurring in the row
+	 */
+	public Stream<TextEntity> streamTextEntitiesInRow(int rowNumber) {
+		return streamRow(rowNumber).map(TableCell::getEntities)
+				.flatMap(Collection::stream)
+				.filter(TextEntity::active);
+	}
+
+	/**
+	 * Streams all Entities in the given column.
+	 *
+	 * @param colNumber the column number to look for
+	 * @return stream of TextEntities occurring in the column
+	 */
+	public Stream<TextEntity> streamTextEntitiesInCol(int colNumber) {
+		return streamCol(colNumber).map(TableCell::getEntities)
+				.flatMap(Collection::stream)
+				.filter(TextEntity::active);
+	}

 	/**
 	 * Returns a TableCell at the provided row and column location.
 	 *
@@ -173,7 +226,8 @@ public class Table implements SemanticNode {
 			throw new IllegalArgumentException(format("row %d, col %d is out of bounds for number of rows of %d and number of cols %d", row, col, numberOfRows, numberOfCols));
 		}
 		int idx = row * numberOfCols + col;
-		return (TableCell) documentTree.getEntryById(treeId).getChildren().get(idx).getNode();
+		return (TableCell) documentTree.getEntryById(treeId).getChildren()
+				.get(idx).getNode();
 	}

@@ -196,7 +250,7 @@ public class Table implements SemanticNode {
 	 */
 	public Stream<TableCell> streamTableCellsWhichContainType(String type) {
-		return streamTableCells().filter(tableCell -> tableCell.getEntities().stream().filter(TextEntity::active).anyMatch(entity -> entity.type().equals(type)));
+		return streamTableCells().filter(tableCell -> tableCell.hasEntitiesOfType(type));
 	}

@@ -222,7 +276,8 @@ public class Table implements SemanticNode {
 	 */
 	public Stream<TableCell> streamCol(int col) {
-		return IntStream.range(0, numberOfRows).boxed().map(row -> getCell(row, col));
+		return IntStream.range(0, numberOfRows).boxed()
+				.map(row -> getCell(row, col));
 	}

@@ -234,7 +289,8 @@ public class Table implements SemanticNode {
 	 */
 	public Stream<TableCell> streamRow(int row) {
-		return IntStream.range(0, numberOfCols).boxed().map(col -> getCell(row, col));
+		return IntStream.range(0, numberOfCols).boxed()
+				.map(col -> getCell(row, col));
 	}

@@ -258,7 +314,8 @@ public class Table implements SemanticNode {
 	 */
 	public Stream<TableCell> streamHeadersForCell(int row, int col) {
-		return Stream.concat(streamRow(row), streamCol(col)).filter(TableCell::isHeader);
+		return Stream.concat(streamRow(row), streamCol(col))
+				.filter(TableCell::isHeader);
 	}

@@ -348,7 +405,9 @@ public class Table implements SemanticNode {
 	public TextBlock getTextBlock() {
 		if (textBlock == null) {
-			textBlock = streamAllSubNodes().filter(SemanticNode::isLeaf).map(SemanticNode::getLeafTextBlock).collect(new TextBlockCollector());
+			textBlock = streamAllSubNodes().filter(SemanticNode::isLeaf)
+					.map(SemanticNode::getLeafTextBlock)
+					.collect(new TextBlockCollector());
 		}
 		return textBlock;
 	}
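The new `streamEntitiesWhereRowContainsEntitiesOfEachType` above collects each row's entity types into a `Set` and keeps the row only when that set `containsAll` the requested types ("each type" semantics), while the "any type" variant needs only a single `anyMatch`. A self-contained sketch of the row-filtering idea, with each row simplified to a plain list of type strings (illustrative names, not the service's classes):

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class RowTypeFilterSketch {

    // Returns the indices of rows whose type set covers ALL requested types,
    // mirroring the containsAll-based filter in the hunk above.
    static List<Integer> rowsContainingEachType(List<List<String>> rows, List<String> types) {
        return IntStream.range(0, rows.size()).boxed()
                .filter(row -> {
                    Set<String> present = rows.get(row).stream().collect(Collectors.toSet());
                    return present.containsAll(types);
                })
                .toList();
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(
                List.of("PERSON", "DATE"),              // row 0: has both
                List.of("PERSON"),                      // row 1: missing DATE
                List.of("DATE", "PERSON", "LOCATION")); // row 2: has both
        System.out.println(rowsContainingEachType(rows, List.of("PERSON", "DATE"))); // [0, 2]
    }
}
```
Swapping `containsAll` for a `Stream.anyMatch` over the same set gives the "any type" behavior of `streamEntitiesWhereRowContainsEntitiesOfType`.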


@@ -17,7 +17,7 @@ public class MessageReceiver {
 	@RabbitHandler
-	@RabbitListener(queues = REDACTION_QUEUE)
+	@RabbitListener(queues = REDACTION_QUEUE, concurrency = "1")
 	public void receiveAnalyzeRequest(Message message) {
 		redactionMessageReceiver.receiveAnalyzeRequest(message, false);


@@ -17,7 +17,7 @@ public class PriorityMessageReceiver {
 	@RabbitHandler
-	@RabbitListener(queues = REDACTION_PRIORITY_QUEUE)
+	@RabbitListener(queues = REDACTION_PRIORITY_QUEUE, concurrency = "1")
 	public void receiveAnalyzeRequest(Message message) {
 		redactionMessageReceiver.receiveAnalyzeRequest(message, true);


@@ -134,7 +134,7 @@ public class RedactionMessageReceiver {
 	private void sendAnalysisFailed(AnalyzeRequest analyzeRequest, boolean priority, Exception e) {
-		log.warn("Failed to process analyze request: {}", analyzeRequest, e);
+		log.error("Failed to process analyze request: {}", analyzeRequest, e);
 		var timestamp = OffsetDateTime.now().truncatedTo(ChronoUnit.MILLIS);
 		fileStatusProcessingUpdateClient.analysisFailed(analyzeRequest.getDossierId(),
 				analyzeRequest.getFileId(),


@ -1,7 +1,6 @@
package com.iqser.red.service.redaction.v1.server.service; package com.iqser.red.service.redaction.v1.server.service;
import java.time.OffsetDateTime; import java.time.OffsetDateTime;
import java.util.Comparator;
import java.util.List; import java.util.List;
import java.util.Optional; import java.util.Optional;
import java.util.Set; import java.util.Set;
@ -13,9 +12,7 @@ import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ChangeType; import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ChangeType;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntityLogEntry; import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntityLogEntry;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntryState; import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntryState;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualChange;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.ManualRedactions; import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.ManualRedactions;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.IdRemoval;
import io.micrometer.core.annotation.Timed; import io.micrometer.core.annotation.Timed;
import lombok.AccessLevel; import lombok.AccessLevel;
@ -26,10 +23,9 @@ import lombok.extern.slf4j.Slf4j;
@Slf4j @Slf4j
@Service @Service
@RequiredArgsConstructor @RequiredArgsConstructor
@FieldDefaults(makeFinal=true, level= AccessLevel.PRIVATE) @FieldDefaults(makeFinal = true, level = AccessLevel.PRIVATE)
public class EntityChangeLogService { public class EntityChangeLogService {
@Timed("redactmanager_computeChanges") @Timed("redactmanager_computeChanges")
public boolean computeChanges(List<EntityLogEntry> previousEntityLogEntries, List<EntityLogEntry> newEntityLogEntries, ManualRedactions manualRedactions, int analysisNumber) { public boolean computeChanges(List<EntityLogEntry> previousEntityLogEntries, List<EntityLogEntry> newEntityLogEntries, ManualRedactions manualRedactions, int analysisNumber) {
@ -42,7 +38,11 @@ public class EntityChangeLogService {
boolean hasChanges = false; boolean hasChanges = false;
for (EntityLogEntry entityLogEntry : newEntityLogEntries) { for (EntityLogEntry entityLogEntry : newEntityLogEntries) {
Optional<EntityLogEntry> optionalPreviousEntity = previousEntityLogEntries.stream().filter(entry -> entry.getId().equals(entityLogEntry.getId())).findAny();
Optional<EntityLogEntry> optionalPreviousEntity = previousEntityLogEntries.stream()
.filter(entry -> entry.getId().equals(entityLogEntry.getId()))
.findAny();
if (optionalPreviousEntity.isEmpty()) { if (optionalPreviousEntity.isEmpty()) {
hasChanges = true; hasChanges = true;
entityLogEntry.getChanges().add(new Change(analysisNumber, ChangeType.ADDED, now)); entityLogEntry.getChanges().add(new Change(analysisNumber, ChangeType.ADDED, now));
@ -50,71 +50,45 @@ public class EntityChangeLogService {
} }
EntityLogEntry previousEntity = optionalPreviousEntity.get(); EntityLogEntry previousEntity = optionalPreviousEntity.get();
entityLogEntry.getChanges().addAll(previousEntity.getChanges()); entityLogEntry.getChanges().addAll(previousEntity.getChanges());
if (!previousEntity.getState().equals(entityLogEntry.getState())) { if (!previousEntity.getState().equals(entityLogEntry.getState())) {
hasChanges = true; hasChanges = true;
ChangeType changeType = calculateChangeType(entityLogEntry.getState(), previousEntity.getState()); ChangeType changeType = calculateChangeType(entityLogEntry.getState(), previousEntity.getState());
entityLogEntry.getChanges().add(new Change(analysisNumber, changeType, now)); entityLogEntry.getChanges().add(new Change(analysisNumber, changeType, now));
} }
addManualChanges(entityLogEntry, previousEntity);
} }
addRemovedEntriesAsRemoved(previousEntityLogEntries, newEntityLogEntries, manualRedactions, analysisNumber, now); addRemovedEntriesAsRemoved(previousEntityLogEntries, newEntityLogEntries, manualRedactions, analysisNumber, now);
return hasChanges; return hasChanges;
} }
// If a manual change is present in the previous entity but not in the new entity, add it to the new one and
// sort them, so they are displayed in the correct order.
private void addManualChanges(EntityLogEntry entityLogEntry, EntityLogEntry previousEntity) {
Comparator<ManualChange> manualChangeComparator =
Comparator.comparing(ManualChange::getManualRedactionType)
.thenComparing(ManualChange::getRequestedDate);
previousEntity.getManualChanges().forEach(manualChange -> {
boolean contains = entityLogEntry.getManualChanges()
.stream()
.anyMatch(existingChange -> manualChangeComparator.compare(existingChange, manualChange) == 0);
if (!contains) {
entityLogEntry.getManualChanges().add(manualChange);
entityLogEntry.getManualChanges().sort(Comparator.comparing(ManualChange::getRequestedDate));
}
});
}
 private void addRemovedEntriesAsRemoved(List<EntityLogEntry> previousEntityLogEntries,
                                         List<EntityLogEntry> newEntityLogEntries,
                                         ManualRedactions manualRedactions,
                                         int analysisNumber,
                                         OffsetDateTime now) {
-	Set<String> existingIds = newEntityLogEntries.stream().map(EntityLogEntry::getId).collect(Collectors.toSet());
+	Set<String> existingIds = newEntityLogEntries.stream()
+			.map(EntityLogEntry::getId)
+			.collect(Collectors.toSet());
 	List<EntityLogEntry> removedEntries = previousEntityLogEntries.stream()
 			.filter(entry -> !existingIds.contains(entry.getId()))
 			.toList();
-	removedEntries.forEach(entry -> entry.getChanges().add(new Change(analysisNumber, ChangeType.REMOVED, now)));
-	removedEntries.forEach(entry -> entry.setState(EntryState.REMOVED));
-	removedEntries.forEach(entry -> addManualChangeForDictionaryRemovals(entry, manualRedactions));
+	removedEntries.stream()
+			.filter(entry -> !entry.getState().equals(EntryState.REMOVED))
+			.peek(entry -> entry.getChanges().add(new Change(analysisNumber, ChangeType.REMOVED, now)))
+			.forEach(entry -> entry.setState(EntryState.REMOVED));
 	newEntityLogEntries.addAll(removedEntries);
 }
 
-private void addManualChangeForDictionaryRemovals(EntityLogEntry entry, ManualRedactions manualRedactions) {
-	if (manualRedactions == null || manualRedactions.getIdsToRemove().isEmpty()) {
-		return;
-	}
-	manualRedactions.getIdsToRemove().stream()
-			.filter(IdRemoval::isRemoveFromDictionary)//
-			.filter(removed -> removed.getAnnotationId().equals(entry.getId()))//
-			.findFirst()//
-			.ifPresent(idRemove -> entry.getManualChanges().add(ManualChangeFactory.toManualChange(idRemove, false)));
-}
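Reviewer note (not part of the diff): the core of `addRemovedEntriesAsRemoved` is a set-membership pass — any id present in the previous log but absent from the new one is marked `REMOVED`. A small sketch of that selection step, using plain id strings instead of `EntityLogEntry`:

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class RemovedEntriesSketch {

    // Given previous and new entry ids, return the ids that disappeared —
    // the same set the service marks with state REMOVED.
    static List<String> removedIds(List<String> previousIds, List<String> newIds) {
        Set<String> existing = Set.copyOf(newIds); // O(1) membership checks
        return previousIds.stream()
                .filter(id -> !existing.contains(id))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> removed = removedIds(List.of("a", "b", "c"), List.of("a", "c"));
        assert removed.equals(List.of("b"));
    }
}
```

The new diff additionally skips entries already in state `REMOVED`, so a `REMOVED` change entry is appended only on the first analysis run in which the entry disappears.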
 private ChangeType calculateChangeType(EntryState state, EntryState previousState) {
 	if (state.equals(previousState)) {


@@ -11,6 +11,7 @@ import java.util.stream.Collectors;
 import org.springframework.stereotype.Service;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.AnalyzeRequest;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntityLog;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntityLogChanges;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntityLogEntry;
@@ -18,7 +19,7 @@ import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntryState;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntryType;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Position;
-import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRedactionEntry;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.ManualChangeFactory;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.dossiertemplate.legalbasis.LegalBasis;
 import com.iqser.red.service.redaction.v1.server.RedactionServiceSettings;
 import com.iqser.red.service.redaction.v1.server.client.LegalBasisClient;
@@ -26,11 +27,13 @@ import com.iqser.red.service.redaction.v1.server.model.PrecursorEntity;
 import com.iqser.red.service.redaction.v1.server.model.dictionary.DictionaryVersion;
 import com.iqser.red.service.redaction.v1.server.model.document.entity.EntityType;
 import com.iqser.red.service.redaction.v1.server.model.document.entity.IEntity;
+import com.iqser.red.service.redaction.v1.server.model.document.entity.ManualChangeOverwrite;
 import com.iqser.red.service.redaction.v1.server.model.document.entity.PositionOnPage;
 import com.iqser.red.service.redaction.v1.server.model.document.entity.TextEntity;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.Document;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.Image;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.ImageType;
+import com.iqser.red.service.redaction.v1.server.model.document.nodes.SemanticNode;
 import com.iqser.red.service.redaction.v1.server.storage.RedactionStorageService;
 import lombok.AccessLevel;
@@ -66,20 +69,19 @@ public class EntityLogCreatorService {
 	List<EntityLogEntry> entityLogEntries = createEntityLogEntries(document, analyzeRequest, notFoundEntities);
 	List<LegalBasis> legalBasis = legalBasisClient.getLegalBasisMapping(analyzeRequest.getDossierTemplateId());
-	EntityLog entityLog = new EntityLog(redactionServiceSettings.getAnalysisVersion(),
-			analyzeRequest.getAnalysisNumber(),
-			entityLogEntries,
-			toEntityLogLegalBasis(legalBasis),
-			dictionaryVersion.getDossierTemplateVersion(),
-			dictionaryVersion.getDossierVersion(),
-			rulesVersion,
-			legalBasisClient.getVersion(analyzeRequest.getDossierTemplateId()));
 	List<EntityLogEntry> previousExistingEntityLogEntries = getPreviousEntityLogEntries(analyzeRequest.getDossierId(), analyzeRequest.getFileId());
 	entityChangeLogService.computeChanges(previousExistingEntityLogEntries, entityLogEntries, analyzeRequest.getManualRedactions(), analyzeRequest.getAnalysisNumber());
-	return entityLog;
+	return new EntityLog(redactionServiceSettings.getAnalysisVersion(),
+			analyzeRequest.getAnalysisNumber(),
+			entityLogEntries,
+			toEntityLogLegalBasis(legalBasis),
+			dictionaryVersion.getDossierTemplateVersion(),
+			dictionaryVersion.getDossierVersion(),
+			rulesVersion,
+			legalBasisClient.getVersion(analyzeRequest.getDossierTemplateId()));
 }
@@ -114,21 +116,24 @@ public class EntityLogCreatorService {
 		DictionaryVersion dictionaryVersion) {
 	List<EntityLogEntry> newEntityLogEntries = createEntityLogEntries(document, analyzeRequest, notFoundEntries).stream()
-			.filter(entry -> entry.getContainingNodeId().isEmpty() || sectionsToReanalyseIds.contains(entry.getContainingNodeId().get(0)))
+			.filter(entry -> entry.getContainingNodeId().isEmpty() || sectionsToReanalyseIds.contains(entry.getContainingNodeId()
+					.get(0)))
 			.collect(Collectors.toList());
-	Set<String> newEntityIds = newEntityLogEntries.stream().map(EntityLogEntry::getId).collect(Collectors.toSet());
+	Set<String> newEntityIds = newEntityLogEntries.stream()
+			.map(EntityLogEntry::getId)
+			.collect(Collectors.toSet());
 	List<EntityLogEntry> previousEntriesFromReAnalyzedSections = previousEntityLog.getEntityLogEntry()
 			.stream()
 			.filter(entry -> (newEntityIds.contains(entry.getId()) || entry.getContainingNodeId().isEmpty() || sectionsToReanalyseIds.contains(entry.getContainingNodeId()
 					.get(0))))
-			.toList();
+			.collect(Collectors.toList());
 	previousEntityLog.getEntityLogEntry().removeAll(previousEntriesFromReAnalyzedSections);
 	boolean hasChanges = entityChangeLogService.computeChanges(previousEntriesFromReAnalyzedSections,
 			newEntityLogEntries,
 			analyzeRequest.getManualRedactions(),
 			analyzeRequest.getAnalysisNumber());
 	previousEntityLog.getEntityLogEntry().addAll(newEntityLogEntries);
 	return updateVersionsAndReturnChanges(previousEntityLog, dictionaryVersion, analyzeRequest, hasChanges);
@@ -137,22 +142,6 @@ public class EntityLogCreatorService {
 private List<EntityLogEntry> createEntityLogEntries(Document document, AnalyzeRequest analyzeRequest, List<PrecursorEntity> notFoundPrecursorEntries) {
-	Set<ManualRedactionEntry> dictionaryEntries;
-	Set<String> dictionaryEntriesValues;
-	if (analyzeRequest.getManualRedactions() != null && !analyzeRequest.getManualRedactions().getEntriesToAdd().isEmpty()) {
-		dictionaryEntries = analyzeRequest.getManualRedactions().getEntriesToAdd()
-				.stream()
-				.filter(e -> e.isAddToDictionary() || e.isAddToDossierDictionary())
-				.collect(Collectors.toSet());
-		dictionaryEntriesValues = dictionaryEntries.stream()
-				.map(ManualRedactionEntry::getValue)
-				.collect(Collectors.toSet());
-	} else {
-		dictionaryEntriesValues = new HashSet<>();
-		dictionaryEntries = new HashSet<>();
-	}
 	String dossierTemplateId = analyzeRequest.getDossierTemplateId();
 	List<EntityLogEntry> entries = new ArrayList<>();
@@ -162,22 +151,21 @@ public class EntityLogCreatorService {
 			.filter(entity -> !entity.getValue().isEmpty())
 			.filter(EntityLogCreatorService::notFalsePositiveOrFalseRecommendation)
 			.filter(entity -> !entity.removed())
-			.forEach(entityNode -> entries.addAll(toEntityLogEntries(entityNode, dictionaryEntries, dictionaryEntriesValues)));
-	document.streamAllImages().filter(entity -> !entity.removed()).forEach(imageNode -> entries.add(createEntityLogEntry(imageNode, dossierTemplateId)));
-	notFoundPrecursorEntries.stream().filter(entity -> !entity.removed()).forEach(precursorEntity -> entries.add(createEntityLogEntry(precursorEntity, dossierTemplateId)));
+			.forEach(entityNode -> entries.addAll(toEntityLogEntries(entityNode)));
+	document.streamAllImages()
+			.filter(entity -> !entity.removed())
+			.forEach(imageNode -> entries.add(createEntityLogEntry(imageNode, dossierTemplateId)));
+	notFoundPrecursorEntries.stream()
+			.filter(entity -> !entity.removed())
+			.forEach(precursorEntity -> entries.add(createEntityLogEntry(precursorEntity, dossierTemplateId)));
 	return entries;
 }
-private List<EntityLogEntry> toEntityLogEntries(TextEntity textEntity, Set<ManualRedactionEntry> dictionaryEntries, Set<String> dictionaryEntriesValues) {
+private List<EntityLogEntry> toEntityLogEntries(TextEntity textEntity) {
 	List<EntityLogEntry> entityLogEntries = new ArrayList<>();
-	// Adding ADD_TO_DICTIONARY manual change to the entity's manual overwrite
-	if (dictionaryEntriesValues.contains(textEntity.getValue())) {
-		textEntity.getManualOverwrite().addChange(dictionaryEntries.stream().filter(entry -> entry.getValue().equals(textEntity.getValue())).findFirst().get());
-	}
 	// split entity into multiple entries if it occurs on multiple pages, since FE can't handle multi page entities
 	for (PositionOnPage positionOnPage : textEntity.getPositionsOnPagePerPage()) {
@@ -204,7 +192,7 @@ public class EntityLogCreatorService {
 	boolean isHint = dictionaryService.isHint(imageType, dossierTemplateId);
 	return EntityLogEntry.builder()
 			.id(image.getId())
-			.value(image.value())
+			.value(image.getValue())
 			.type(imageType)
 			.reason(image.buildReasonWithManualChangeDescriptions())
 			.legalBasis(image.legalBasis())
@@ -213,11 +201,13 @@ public class EntityLogCreatorService {
 			.positions(List.of(new Position(image.getPosition(), image.getPage().getNumber())))
 			.containingNodeId(image.getTreeId())
 			.closestHeadline(image.getHeadline().getTextBlock().getSearchText())
-			.section(image.getManualOverwrite().getSection().orElse(image.getParent().toString()))
+			.section(image.getManualOverwrite().getSection()
+					.orElse(this.buildSectionString(image.getParent())))
 			.imageHasTransparency(image.isTransparent())
-			.manualChanges(ManualChangeFactory.toManualChangeList(image.getManualOverwrite().getManualChangeLog(), isHint))
+			.manualChanges(ManualChangeFactory.toLocalManualChangeList(image.getManualOverwrite().getManualChangeLog(), true))
 			.state(buildEntryState(image))
 			.entryType(isHint ? EntryType.IMAGE_HINT : EntryType.IMAGE)
+			.engines(getEngines(null, image.getManualOverwrite()))
 			.build();
 }
@@ -225,7 +215,8 @@ public class EntityLogCreatorService {
 private EntityLogEntry createEntityLogEntry(PrecursorEntity precursorEntity, String dossierTemplateId) {
-	String type = precursorEntity.getManualOverwrite().getType().orElse(precursorEntity.getType());
+	String type = precursorEntity.getManualOverwrite().getType()
+			.orElse(precursorEntity.getType());
 	boolean isHint = isHint(precursorEntity.getEntityType());
 	return EntityLogEntry.builder()
 			.id(precursorEntity.getId())
@@ -235,7 +226,8 @@ public class EntityLogCreatorService {
 			.type(type)
 			.state(buildEntryState(precursorEntity))
 			.entryType(buildEntryType(precursorEntity))
-			.section(precursorEntity.getManualOverwrite().getSection().orElse(precursorEntity.getSection()))
+			.section(precursorEntity.getManualOverwrite().getSection()
+					.orElse(precursorEntity.getSection()))
 			.containingNodeId(Collections.emptyList())
 			.closestHeadline("")
 			.matchedRule(precursorEntity.getMatchedRule().getRuleIdentifier().toString())
@@ -245,18 +237,17 @@ public class EntityLogCreatorService {
 			.textBefore("")
 			.startOffset(-1)
 			.endOffset(-1)
-			.positions(precursorEntity.getManualOverwrite()
-					.getPositions()
+			.positions(precursorEntity.getManualOverwrite().getPositions()
 					.orElse(precursorEntity.getEntityPosition())
 					.stream()
 					.map(entityPosition -> new Position(entityPosition.rectangle2D(), entityPosition.pageNumber()))
 					.toList())
-			.engines(precursorEntity.getEngines())
+			.engines(getEngines(precursorEntity.getEngines(), precursorEntity.getManualOverwrite()))
 			//imported is no longer used, frontend should check engines
 			//(was .imported(precursorEntity.getEngines() != null && precursorEntity.getEngines().contains(Engine.IMPORTED)))
 			.imported(false)
 			.reference(Collections.emptySet())
-			.manualChanges(ManualChangeFactory.toManualChangeList(precursorEntity.getManualOverwrite().getManualChangeLog(), isHint))
+			.manualChanges(ManualChangeFactory.toLocalManualChangeList(precursorEntity.getManualOverwrite().getManualChangeLog(), true))
 			.build();
 }
@@ -264,14 +255,20 @@ public class EntityLogCreatorService {
 private EntityLogEntry createEntityLogEntry(TextEntity entity) {
 	Set<String> referenceIds = new HashSet<>();
-	entity.references().stream().filter(TextEntity::active).forEach(ref -> ref.getPositionsOnPagePerPage().forEach(pos -> referenceIds.add(pos.getId())));
+	entity.references()
+			.stream()
+			.filter(TextEntity::active)
+			.forEach(ref -> ref.getPositionsOnPagePerPage()
+					.forEach(pos -> referenceIds.add(pos.getId())));
 	boolean isHint = isHint(entity.getEntityType());
 	return EntityLogEntry.builder()
 			.reason(entity.buildReasonWithManualChangeDescriptions())
 			.legalBasis(entity.legalBasis())
-			.value(entity.getManualOverwrite().getValue().orElse(entity.getMatchedRule().isWriteValueWithLineBreaks() ? entity.getValueWithLineBreaks() : entity.getValue()))
+			.value(entity.getManualOverwrite().getValue()
+					.orElse(entity.getMatchedRule().isWriteValueWithLineBreaks() ? entity.getValueWithLineBreaks() : entity.getValue()))
 			.type(entity.type())
-			.section(entity.getManualOverwrite().getSection().orElse(entity.getDeepestFullyContainingNode().toString()))
+			.section(entity.getManualOverwrite().getSection()
+					.orElse(this.buildSectionString(entity.getDeepestFullyContainingNode())))
 			.containingNodeId(entity.getDeepestFullyContainingNode().getTreeId())
 			.closestHeadline(entity.getDeepestFullyContainingNode().getHeadline().getTextBlock().getSearchText())
 			.matchedRule(entity.getMatchedRule().getRuleIdentifier().toString())
@@ -281,25 +278,36 @@ public class EntityLogCreatorService {
 			.startOffset(entity.getTextRange().start())
 			.endOffset(entity.getTextRange().end())
 			.dossierDictionaryEntry(entity.isDossierDictionaryEntry())
-			.engines(entity.getEngines() != null ? entity.getEngines() : Collections.emptySet())
+			.engines(getEngines(entity.getEngines(), entity.getManualOverwrite()))
 			//imported is no longer used, frontend should check engines
 			//(was .imported(entity.getEngines() != null && entity.getEngines().contains(Engine.IMPORTED)))
 			.imported(false)
 			.reference(referenceIds)
-			.manualChanges(ManualChangeFactory.toManualChangeList(entity.getManualOverwrite().getManualChangeLog(), isHint))
+			.manualChanges(ManualChangeFactory.toLocalManualChangeList(entity.getManualOverwrite().getManualChangeLog(), true))
 			.state(buildEntryState(entity))
 			.entryType(buildEntryType(entity))
 			.build();
 }
+private Set<Engine> getEngines(Set<Engine> currentEngines, ManualChangeOverwrite manualChangeOverwrite) {
+	Set<Engine> engines = currentEngines != null ? new HashSet<>(currentEngines) : new HashSet<>();
+	if (manualChangeOverwrite != null && !manualChangeOverwrite.getManualChangeLog().isEmpty()) {
+		engines.add(Engine.MANUAL);
+	}
+	return engines;
+}
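Reviewer note (not part of the diff): the new `getEngines(...)` helper is a null-safe copy of the detected engines plus a `MANUAL` flag when any manual change is logged. A minimal sketch of the same behavior, with an illustrative `Engine` enum and a plain list standing in for the manual change log:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class EnginesSketch {

    enum Engine { REGEX, ML, IMPORTED, MANUAL }

    // Copy the detected engines (tolerating null) and add MANUAL when the
    // entity carries at least one manual change — mirroring getEngines(...).
    static Set<Engine> engines(Set<Engine> detected, List<String> manualChangeLog) {
        Set<Engine> result = detected != null ? new HashSet<>(detected) : new HashSet<>();
        if (manualChangeLog != null && !manualChangeLog.isEmpty()) {
            result.add(Engine.MANUAL);
        }
        return result;
    }

    public static void main(String[] args) {
        // A purely manual entity (no detection engines) still reports MANUAL.
        assert engines(null, List.of("resize")).equals(Set.of(Engine.MANUAL));
        // No manual changes: the detected set passes through unchanged.
        assert engines(Set.of(Engine.ML), List.of()).equals(Set.of(Engine.ML));
    }
}
```

Returning a fresh `HashSet` also means callers can't mutate the entity's original engine set through the builder value.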
 private boolean isHint(EntityType entityType) {
 	return entityType.equals(EntityType.HINT);
 }
 
-private EntryState buildEntryState(IEntity entity) {
+public static EntryState buildEntryState(IEntity entity) {
 	if (entity.applied() && entity.active()) {
 		return EntryState.APPLIED;
@@ -313,12 +321,17 @@ public class EntityLogCreatorService {
 }
 
-private EntryType buildEntryType(IEntity entity) {
+public static EntryType buildEntryType(IEntity entity) {
 	if (entity instanceof TextEntity textEntity) {
 		return getEntryType(textEntity.getEntityType());
 	} else if (entity instanceof PrecursorEntity precursorEntity) {
-		return precursorEntity.getEntryType();
+		if (precursorEntity.isRectangle()) {
+			return EntryType.AREA;
+		}
+		return getEntryType(precursorEntity.getEntityType());
+	} else if (entity instanceof Image) {
+		return EntryType.IMAGE;
 	}
 	throw new UnsupportedOperationException(String.format("Entity subclass %s is not implemented!", entity.getClass()));
 }
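Reviewer note (not part of the diff): the reworked `buildEntryType` is an `instanceof` pattern-matching dispatch over entity subclasses with an exception for anything unhandled. A compact sketch of the pattern, using illustrative stand-in types rather than the project's entity hierarchy:

```java
public class EntryTypeSketch {

    interface Entity {}
    record Text(boolean hint) implements Entity {}
    record Area() implements Entity {}
    record Picture() implements Entity {}

    // Dispatch on the concrete subclass via pattern matching for instanceof,
    // falling through to an exception for unknown subclasses.
    static String entryType(Entity e) {
        if (e instanceof Text text) {
            return text.hint() ? "HINT" : "TEXT";
        } else if (e instanceof Area) {
            return "AREA";
        } else if (e instanceof Picture) {
            return "IMAGE";
        }
        throw new UnsupportedOperationException(
                String.format("Entity subclass %s is not implemented!", e.getClass()));
    }

    public static void main(String[] args) {
        assert entryType(new Text(false)).equals("TEXT");
        assert entryType(new Area()).equals("AREA");
        assert entryType(new Picture()).equals("IMAGE");
    }
}
```

With a sealed hierarchy this chain could become an exhaustive `switch` pattern instead, which would turn the runtime exception into a compile-time check.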
@@ -338,7 +351,15 @@ public class EntityLogCreatorService {
 private List<EntityLogLegalBasis> toEntityLogLegalBasis(List<LegalBasis> legalBasis) {
-	return legalBasis.stream().map(l -> new EntityLogLegalBasis(l.getName(), l.getDescription(), l.getReason())).collect(Collectors.toList());
+	return legalBasis.stream()
+			.map(l -> new EntityLogLegalBasis(l.getName(), l.getDescription(), l.getReason()))
+			.collect(Collectors.toList());
+}
+
+private String buildSectionString(SemanticNode node) {
+	return node.getType().toString() + ": " + node.getTextBlock().buildSummary();
 }
 }


@@ -1,54 +0,0 @@
-package com.iqser.red.service.redaction.v1.server.service;
-
-import java.time.OffsetDateTime;
-import java.util.List;
-import java.util.stream.Collectors;
-
-import org.springframework.stereotype.Service;
-
-import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.BaseAnnotation;
-import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.IdRemoval;
-import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualForceRedaction;
-import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualLegalBasisChange;
-import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRecategorization;
-import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRedactionEntry;
-import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualResizeRedaction;
-import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualChange;
-import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType;
-
-import lombok.experimental.UtilityClass;
-
-@UtilityClass
-public class ManualChangeFactory {
-
-	public List<ManualChange> toManualChangeList(List<BaseAnnotation> manualChanges, boolean isHint) {
-		return manualChanges.stream().map(baseAnnotation -> toManualChange(baseAnnotation, isHint)).collect(Collectors.toList());
-	}
-
-	public ManualChange toManualChange(BaseAnnotation baseAnnotation, boolean isHint) {
-		ManualChange manualChange = ManualChange.from(baseAnnotation);
-		if (baseAnnotation instanceof ManualRecategorization imageRecategorization) {
-			manualChange.withManualRedactionType(ManualRedactionType.RECATEGORIZE).withChange("type", imageRecategorization.getType());
-		} else if (baseAnnotation instanceof IdRemoval manualRemoval) {
-			manualChange.withManualRedactionType(manualRemoval.isRemoveFromDictionary() ? ManualRedactionType.REMOVE_FROM_DICTIONARY : ManualRedactionType.REMOVE_LOCALLY);
-		} else if (baseAnnotation instanceof ManualForceRedaction) {
-			manualChange.withManualRedactionType(isHint ? ManualRedactionType.FORCE_HINT : ManualRedactionType.FORCE_REDACT);
-		} else if (baseAnnotation instanceof ManualResizeRedaction manualResizeRedact) {
-			manualChange.withManualRedactionType(manualResizeRedact.getUpdateDictionary() ? ManualRedactionType.RESIZE_IN_DICTIONARY : ManualRedactionType.RESIZE).withChange("value", manualResizeRedact.getValue());
-		} else if (baseAnnotation instanceof ManualRedactionEntry manualRedactionEntry) {
-			manualChange.withManualRedactionType(manualRedactionEntry.isAddToDictionary() ? ManualRedactionType.ADD_TO_DICTIONARY : ManualRedactionType.ADD_LOCALLY)
-					.withChange("value", manualRedactionEntry.getValue());
-		} else if (baseAnnotation instanceof ManualLegalBasisChange manualLegalBasisChange) {
-			manualChange.withManualRedactionType(ManualRedactionType.LEGAL_BASIS_CHANGE)
-					.withChange("section", manualLegalBasisChange.getSection())
-					.withChange("value", manualLegalBasisChange.getValue())
-					.withChange("legalBasis", manualLegalBasisChange.getLegalBasis());
-		}
-		manualChange.setProcessedDate(OffsetDateTime.now());
-		return manualChange;
-	}
-}


@ -47,6 +47,10 @@ public class ManualChangesApplicationService {
entityToBeReCategorized.getMatchedRuleList().clear(); entityToBeReCategorized.getMatchedRuleList().clear();
entityToBeReCategorized.getManualOverwrite().addChange(manualRecategorization); entityToBeReCategorized.getManualOverwrite().addChange(manualRecategorization);
if (manualRecategorization.getType() == null) {
return;
}
if (entityToBeReCategorized instanceof Image image) { if (entityToBeReCategorized instanceof Image image) {
image.setImageType(ImageType.fromString(manualRecategorization.getType())); image.setImageType(ImageType.fromString(manualRecategorization.getType()));
return; return;
@ -74,9 +78,9 @@ public class ManualChangesApplicationService {
.orElseThrow(() -> new NoSuchElementException("No redaction position with matching annotation id found!")); .orElseThrow(() -> new NoSuchElementException("No redaction position with matching annotation id found!"));
positionOnPageToBeResized.setRectanglePerLine(manualResizeRedaction.getPositions() positionOnPageToBeResized.setRectanglePerLine(manualResizeRedaction.getPositions()
.stream() .stream()
.map(ManualChangesApplicationService::toRectangle2D) .map(ManualChangesApplicationService::toRectangle2D)
.collect(Collectors.toList())); .collect(Collectors.toList()));
entityToBeResized.getManualOverwrite().addChange(manualResizeRedaction); entityToBeResized.getManualOverwrite().addChange(manualResizeRedaction);
@ -90,11 +94,17 @@ public class ManualChangesApplicationService {
if (closestEntity.isPresent()) { if (closestEntity.isPresent()) {
copyValuesFromClosestEntity(entityToBeResized, manualResizeRedaction, closestEntity.get()); copyValuesFromClosestEntity(entityToBeResized, manualResizeRedaction, closestEntity.get());
possibleEntities.values().stream().flatMap(Collection::stream).forEach(TextEntity::removeFromGraph); possibleEntities.values()
.stream()
.flatMap(Collection::stream)
.forEach(TextEntity::removeFromGraph);
return; return;
} }
possibleEntities.values().stream().flatMap(Collection::stream).forEach(TextEntity::removeFromGraph); possibleEntities.values()
.stream()
.flatMap(Collection::stream)
.forEach(TextEntity::removeFromGraph);
if (node.hasParent()) { if (node.hasParent()) {
node = node.getParent(); node = node.getParent();
@ -110,14 +120,18 @@ public class ManualChangesApplicationService {
Set<SemanticNode> currentIntersectingNodes = new HashSet<>(entityToBeResized.getIntersectingNodes()); Set<SemanticNode> currentIntersectingNodes = new HashSet<>(entityToBeResized.getIntersectingNodes());
Set<SemanticNode> newIntersectingNodes = new HashSet<>(closestEntity.getIntersectingNodes()); Set<SemanticNode> newIntersectingNodes = new HashSet<>(closestEntity.getIntersectingNodes());
Sets.difference(currentIntersectingNodes, newIntersectingNodes).forEach(removedNode -> removedNode.getEntities().remove(entityToBeResized)); Sets.difference(currentIntersectingNodes, newIntersectingNodes)
Sets.difference(newIntersectingNodes, currentIntersectingNodes).forEach(addedNode -> addedNode.getEntities().add(entityToBeResized)); .forEach(removedNode -> removedNode.getEntities().remove(entityToBeResized));
Sets.difference(newIntersectingNodes, currentIntersectingNodes)
.forEach(addedNode -> addedNode.getEntities().add(entityToBeResized));
Set<Page> currentIntersectingPages = new HashSet<>(entityToBeResized.getPages()); Set<Page> currentIntersectingPages = new HashSet<>(entityToBeResized.getPages());
Set<Page> newIntersectingPages = new HashSet<>(closestEntity.getPages()); Set<Page> newIntersectingPages = new HashSet<>(closestEntity.getPages());
Sets.difference(currentIntersectingPages, newIntersectingPages).forEach(removedPage -> removedPage.getEntities().remove(entityToBeResized)); Sets.difference(currentIntersectingPages, newIntersectingPages)
Sets.difference(newIntersectingPages, currentIntersectingPages).forEach(addedPage -> addedPage.getEntities().add(entityToBeResized)); .forEach(removedPage -> removedPage.getEntities().remove(entityToBeResized));
Sets.difference(newIntersectingPages, currentIntersectingPages)
.forEach(addedPage -> addedPage.getEntities().add(entityToBeResized));
entityToBeResized.setDeepestFullyContainingNode(closestEntity.getDeepestFullyContainingNode()); entityToBeResized.setDeepestFullyContainingNode(closestEntity.getDeepestFullyContainingNode());
entityToBeResized.setIntersectingNodes(new ArrayList<>(newIntersectingNodes)); entityToBeResized.setIntersectingNodes(new ArrayList<>(newIntersectingNodes));
@@ -135,7 +149,10 @@ public class ManualChangesApplicationService {
 if (manualResizeRedaction.getPositions() == null || manualResizeRedaction.getPositions().isEmpty()) {
     return;
 }
-var bBox = RectangleTransformations.rectangle2DBBox(manualResizeRedaction.getPositions().stream().map(ManualChangesApplicationService::toRectangle2D).toList());
+var bBox = RectangleTransformations.rectangle2DBBox(manualResizeRedaction.getPositions()
+        .stream()
+        .map(ManualChangesApplicationService::toRectangle2D)
+        .toList());
 image.setPosition(bBox);
 image.getManualOverwrite().addChange(manualResizeRedaction);
 }


@@ -16,6 +16,7 @@ import com.iqser.red.service.persistence.service.v1.api.shared.model.AnalyzeRequ
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntityLog;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Position;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.ManualRedactions;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.BaseAnnotation;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRedactionEntry;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualResizeRedaction;
 import com.iqser.red.service.redaction.v1.model.AnalyzeResponse;
@@ -25,15 +26,12 @@ import com.iqser.red.service.redaction.v1.server.model.PrecursorEntity;
 import com.iqser.red.service.redaction.v1.server.model.document.entity.TextEntity;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.Document;
 import com.iqser.red.service.redaction.v1.server.service.document.DocumentGraphMapper;
-import com.iqser.red.service.redaction.v1.server.service.document.EntityCreationService;
-import com.iqser.red.service.redaction.v1.server.service.document.EntityEnrichmentService;
 import com.iqser.red.service.redaction.v1.server.service.document.EntityFindingUtility;
 import com.iqser.red.service.redaction.v1.server.service.document.EntityFromPrecursorCreationService;
 import com.iqser.red.service.redaction.v1.server.storage.ObservedStorageService;
 import com.iqser.red.service.redaction.v1.server.storage.RedactionStorageService;
 import io.micrometer.observation.annotation.Observed;
-import jakarta.annotation.PostConstruct;
 import lombok.AccessLevel;
 import lombok.RequiredArgsConstructor;
 import lombok.experimental.FieldDefaults;
@@ -51,20 +49,10 @@ public class UnprocessedChangesService {
 final ObservedStorageService observedStorageService;
 final EntityFindingUtility entityFindingUtility;
 final RedactionStorageService redactionStorageService;
-final EntityEnrichmentService entityEnrichmentService;
 final EntityFromPrecursorCreationService entityFromPrecursorCreationService;
 final DictionaryService dictionaryService;
 final ManualChangesApplicationService manualChangesApplicationService;
-
-EntityCreationService entityCreationService;
-
-@PostConstruct
-public void initEntityCreationService() {
-    entityCreationService = new EntityCreationService(entityEnrichmentService);
-}
 
 @Observed(name = "UnprocessedChangesService", contextualName = "analyse-surrounding-text")
 public void analyseSurroundingText(AnalyzeRequest analyzeRequest) {
@@ -76,11 +64,19 @@ public class UnprocessedChangesService {
 EntityLog previousEntityLog = redactionStorageService.getEntityLog(analyzeRequest.getDossierId(), analyzeRequest.getFileId());
 Document document = DocumentGraphMapper.toDocumentGraph(observedStorageService.getDocumentData(analyzeRequest.getDossierId(), analyzeRequest.getFileId()));
-Set<String> allAnnotationIds = analyzeRequest.getManualRedactions().getEntriesToAdd().stream().map(ManualRedactionEntry::getAnnotationId).collect(Collectors.toSet());
-Set<String> resizeIds = analyzeRequest.getManualRedactions().getResizeRedactions().stream().map(ManualResizeRedaction::getAnnotationId).collect(Collectors.toSet());
+Set<String> allAnnotationIds = analyzeRequest.getManualRedactions().getEntriesToAdd()
+        .stream()
+        .map(ManualRedactionEntry::getAnnotationId)
+        .collect(Collectors.toSet());
+Set<String> resizeIds = analyzeRequest.getManualRedactions().getResizeRedactions()
+        .stream()
+        .map(ManualResizeRedaction::getAnnotationId)
+        .collect(Collectors.toSet());
 allAnnotationIds.addAll(resizeIds);
-List<ManualResizeRedaction> manualResizeRedactions = analyzeRequest.getManualRedactions().getResizeRedactions().stream().toList();
+List<ManualResizeRedaction> manualResizeRedactions = analyzeRequest.getManualRedactions().getResizeRedactions()
+        .stream()
+        .toList();
 List<PrecursorEntity> manualEntitiesToBeResized = previousEntityLog.getEntityLogEntry()
         .stream()
         .filter(entityLogEntry -> resizeIds.contains(entityLogEntry.getId()))
@@ -99,31 +95,36 @@ public class UnprocessedChangesService {
     notFoundManualEntities = entityFromPrecursorCreationService.toTextEntity(manualEntities, document);
 }
 
-document.getEntities().forEach(textEntity -> {
-    Set<String> processedIds = new HashSet<>();
-    for (var positionsOnPerPage : textEntity.getPositionsOnPagePerPage()) {
-        if (processedIds.contains(positionsOnPerPage.getId())) {
-            continue;
-        }
-        processedIds.add(positionsOnPerPage.getId());
-        List<Position> positions = positionsOnPerPage.getRectanglePerLine()
-                .stream()
-                .map(rectangle2D -> new Position(rectangle2D, positionsOnPerPage.getPage().getNumber()))
-                .collect(Collectors.toList());
-        unprocessedManualEntities.add(UnprocessedManualEntity.builder()
-                .annotationId(allAnnotationIds.stream().filter(textEntity::matchesAnnotationId).findFirst().orElse(""))
-                .textBefore(textEntity.getTextBefore())
-                .textAfter(textEntity.getTextAfter())
-                .section(textEntity.getManualOverwrite().getSection().orElse(textEntity.getDeepestFullyContainingNode().toString()))
-                .positions(positions)
-                .build());
-    }
-});
+document.getEntities()
+        .forEach(textEntity -> {
+            Set<String> processedIds = new HashSet<>();
+            for (var positionsOnPerPage : textEntity.getPositionsOnPagePerPage()) {
+                if (processedIds.contains(positionsOnPerPage.getId())) {
+                    continue;
+                }
+                processedIds.add(positionsOnPerPage.getId());
+                List<Position> positions = positionsOnPerPage.getRectanglePerLine()
+                        .stream()
+                        .map(rectangle2D -> new Position(rectangle2D, positionsOnPerPage.getPage().getNumber()))
+                        .collect(Collectors.toList());
+                unprocessedManualEntities.add(UnprocessedManualEntity.builder()
+                        .annotationId(allAnnotationIds.stream()
+                                .filter(textEntity::matchesAnnotationId)
+                                .findFirst()
+                                .orElse(""))
+                        .textBefore(textEntity.getTextBefore())
+                        .textAfter(textEntity.getTextAfter())
+                        .section(textEntity.getManualOverwrite().getSection()
+                                .orElse(textEntity.getDeepestFullyContainingNode().toString()))
+                        .positions(positions)
+                        .build());
+            }
+        });
 
 notFoundManualEntities.forEach(manualEntity -> unprocessedManualEntities.add(builDefaultUnprocessedManualEntity(manualEntity)));
 
 rabbitTemplate.convertAndSend(QueueNames.REDACTION_ANALYSIS_RESPONSE_QUEUE,
         AnalyzeResponse.builder().fileId(analyzeRequest.getFileId()).unprocessedManualEntities(unprocessedManualEntities).build());
 }
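The loop in the hunk above uses a `HashSet` of already-seen ids so that each positions-group is emitted at most once even when the entity reports it on several pages. The same dedupe-by-id pattern in isolation (the `Group` record and ids are illustrative stand-ins, not the service's types):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DedupeDemo {

    record Group(String id, String payload) {}

    // Keep only the first group seen per id, preserving encounter order.
    static List<Group> firstPerId(List<Group> groups) {
        Set<String> processedIds = new HashSet<>();
        List<Group> result = new ArrayList<>();
        for (Group group : groups) {
            if (processedIds.contains(group.id())) {
                continue; // a group with this id was already emitted
            }
            processedIds.add(group.id());
            result.add(group);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Group> groups = List.of(new Group("a", "p1"), new Group("a", "p2"), new Group("b", "p3"));
        System.out.println(firstPerId(groups).size()); // 2
    }
}
```

`Set.add` returns whether the element was new, so `if (!processedIds.add(group.id())) continue;` is an equivalent, slightly tighter form of the contains-then-add pair.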
@@ -143,13 +144,13 @@ public class UnprocessedChangesService {
     continue;
 }
 
-TextEntity correctEntity = createCorrectEntity(precursorEntity, optionalTextEntity.get());
+TextEntity correctEntity = EntityFromPrecursorCreationService.createCorrectEntity(precursorEntity, optionalTextEntity.get());
 Optional<ManualResizeRedaction> optionalManualResizeRedaction = manualResizeRedactions.stream()
         .filter(manualResizeRedaction -> manualResizeRedaction.getAnnotationId().equals(precursorEntity.getId()))
         .findFirst();
 if (optionalManualResizeRedaction.isPresent()) {
     ManualResizeRedaction manualResizeRedaction = optionalManualResizeRedaction.get();
-    manualChangesApplicationService.resizeEntityAndReinsert(correctEntity, manualResizeRedaction);
+    manualChangesApplicationService.resize(correctEntity, manualResizeRedaction);
 
     // If the entity's value is not the same as the manual resize request's value it means we didn't find it anywhere and we want to remove it
     // from the graph, so it does not get processed and sent back to persistence-service to update its value.
@@ -160,60 +161,38 @@ public class UnprocessedChangesService {
 }
 
 // remove all temp entities from the graph
-tempEntities.values().stream().flatMap(Collection::stream).forEach(TextEntity::removeFromGraph);
+tempEntities.values()
+        .stream()
+        .flatMap(Collection::stream)
+        .forEach(TextEntity::removeFromGraph);
 }
-private TextEntity createCorrectEntity(PrecursorEntity precursorEntity, TextEntity closestEntity) {
-
-    TextEntity correctEntity = TextEntity.initialEntityNode(closestEntity.getTextRange(), precursorEntity.type(), precursorEntity.getEntityType(), precursorEntity.getId());
-
-    correctEntity.setDeepestFullyContainingNode(closestEntity.getDeepestFullyContainingNode());
-    correctEntity.setIntersectingNodes(new ArrayList<>(closestEntity.getIntersectingNodes()));
-    correctEntity.setDuplicateTextRanges(new ArrayList<>(closestEntity.getDuplicateTextRanges()));
-    correctEntity.setPages(new HashSet<>(closestEntity.getPages()));
-
-    correctEntity.setValue(closestEntity.getValue());
-    correctEntity.setTextAfter(closestEntity.getTextAfter());
-    correctEntity.setTextBefore(closestEntity.getTextBefore());
-
-    correctEntity.getIntersectingNodes().forEach(n -> n.getEntities().add(correctEntity));
-    correctEntity.getPages().forEach(page -> page.getEntities().add(correctEntity));
-    correctEntity.addMatchedRules(precursorEntity.getMatchedRuleList());
-    correctEntity.setDictionaryEntry(precursorEntity.isDictionaryEntry());
-    correctEntity.setDossierDictionaryEntry(precursorEntity.isDossierDictionaryEntry());
-    correctEntity.getManualOverwrite().addChanges(precursorEntity.getManualOverwrite().getManualChangeLog());
-
-    return correctEntity;
-}
-
 private UnprocessedManualEntity builDefaultUnprocessedManualEntity(PrecursorEntity precursorEntity) {
 
     return UnprocessedManualEntity.builder()
             .annotationId(precursorEntity.getId())
             .textAfter("")
             .textBefore("")
             .section("")
-            .positions(precursorEntity.getManualOverwrite()
-                    .getPositions()
+            .positions(precursorEntity.getManualOverwrite().getPositions()
                     .orElse(precursorEntity.getEntityPosition())
                     .stream()
                     .map(entityPosition -> new Position(entityPosition.rectangle2D(), entityPosition.pageNumber()))
                     .toList())
             .build();
 }
 
 private List<PrecursorEntity> manualEntitiesConverter(ManualRedactions manualRedactions, String dossierTemplateId) {
 
     return manualRedactions.getEntriesToAdd()
             .stream()
             .filter(manualRedactionEntry -> manualRedactionEntry.getPositions() != null && !manualRedactionEntry.getPositions().isEmpty())
+            .filter(BaseAnnotation::isLocal)
             .map(manualRedactionEntry -> PrecursorEntity.fromManualRedactionEntry(manualRedactionEntry,
                     dictionaryService.isHint(manualRedactionEntry.getType(), dossierTemplateId)))
             .toList();
 }
 }


@@ -1,5 +1,6 @@
 package com.iqser.red.service.redaction.v1.server.service.document;
 
+import java.util.ArrayList;
 import java.util.Arrays;
 import java.util.HashSet;
 import java.util.LinkedList;
@@ -7,22 +8,22 @@ import java.util.List;
 import java.util.Map;
 import java.util.NoSuchElementException;
 
-import com.iqser.red.service.redaction.v1.server.model.document.DocumentData;
 import com.iqser.red.service.redaction.v1.server.model.document.DocumentTree;
-import com.iqser.red.service.redaction.v1.server.model.document.nodes.Document;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.Footer;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.Header;
-import com.iqser.red.service.redaction.v1.server.model.document.nodes.Headline;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.Image;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.Page;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.Paragraph;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.Section;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.SemanticNode;
-import com.iqser.red.service.redaction.v1.server.model.document.nodes.Table;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.TableCell;
 import com.iqser.red.service.redaction.v1.server.model.document.textblock.AtomicTextBlock;
 import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlock;
 import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlockCollector;
+import com.iqser.red.service.redaction.v1.server.model.document.DocumentData;
+import com.iqser.red.service.redaction.v1.server.model.document.nodes.Document;
+import com.iqser.red.service.redaction.v1.server.model.document.nodes.Headline;
+import com.iqser.red.service.redaction.v1.server.model.document.nodes.Table;
 import com.knecon.fforesight.service.layoutparser.internal.api.data.redaction.DocumentPage;
 import com.knecon.fforesight.service.layoutparser.internal.api.data.redaction.DocumentPositionData;
 import com.knecon.fforesight.service.layoutparser.internal.api.data.redaction.DocumentStructure;
@@ -39,7 +40,9 @@ public class DocumentGraphMapper {
 DocumentTree documentTree = new DocumentTree(document);
 Context context = new Context(documentData, documentTree);
 
-context.pageData.addAll(Arrays.stream(documentData.getDocumentPages()).map(DocumentGraphMapper::buildPage).toList());
+context.pageData.addAll(Arrays.stream(documentData.getDocumentPages())
+        .map(DocumentGraphMapper::buildPage)
+        .toList());
 
 context.documentTree.getRoot().getChildren().addAll(buildEntries(documentData.getDocumentStructure().getRoot().getChildren(), context));
@@ -54,10 +57,12 @@ public class DocumentGraphMapper {
 private List<DocumentTree.Entry> buildEntries(List<DocumentStructure.EntryData> entries, Context context) {
 
-    List<DocumentTree.Entry> newEntries = new LinkedList<>();
+    List<DocumentTree.Entry> newEntries = new ArrayList<>(entries.size());
     for (DocumentStructure.EntryData entryData : entries) {
-        List<Page> pages = Arrays.stream(entryData.getPageNumbers()).map(pageNumber -> getPage(pageNumber, context)).toList();
+        List<Page> pages = Arrays.stream(entryData.getPageNumbers())
+                .map(pageNumber -> getPage(pageNumber, context))
+                .toList();
 
         SemanticNode node = switch (entryData.getType()) {
             case SECTION -> buildSection(context);
@@ -75,7 +80,8 @@ public class DocumentGraphMapper {
     TextBlock textBlock = toTextBlock(entryData.getAtomicBlockIds(), context, node);
     node.setLeafTextBlock(textBlock);
 }
-List<Integer> treeId = Arrays.stream(entryData.getTreeId()).boxed().toList();
+List<Integer> treeId = Arrays.stream(entryData.getTreeId()).boxed()
+        .toList();
 node.setTreeId(treeId);
 
 switch (entryData.getType()) {
@@ -148,16 +154,18 @@ public class DocumentGraphMapper {
 private TextBlock toTextBlock(Long[] atomicTextBlockIds, Context context, SemanticNode parent) {
 
-    return Arrays.stream(atomicTextBlockIds).map(atomicTextBlockId -> getAtomicTextBlock(context, parent, atomicTextBlockId)).collect(new TextBlockCollector());
+    return Arrays.stream(atomicTextBlockIds)
+            .map(atomicTextBlockId -> getAtomicTextBlock(context, parent, atomicTextBlockId))
+            .collect(new TextBlockCollector());
 }
 
 private AtomicTextBlock getAtomicTextBlock(Context context, SemanticNode parent, Long atomicTextBlockId) {
 
     return AtomicTextBlock.fromAtomicTextBlockData(context.documentTextData.get(Math.toIntExact(atomicTextBlockId)),
             context.documentPositionData.get(Math.toIntExact(atomicTextBlockId)),
             parent,
             getPage(context.documentTextData.get(Math.toIntExact(atomicTextBlockId)).getPage(), context));
 }
@@ -171,8 +179,7 @@ public class DocumentGraphMapper {
 return context.pageData.stream()
         .filter(page -> page.getNumber() == Math.toIntExact(pageIndex))
-        .findFirst()
-        .orElseThrow(() -> new NoSuchElementException(String.format("ClassificationPage with number %d not found", pageIndex)));
+        .findFirst().orElseThrow(() -> new NoSuchElementException(String.format("ClassificationPage with number %d not found", pageIndex)));
 }
@@ -188,8 +195,10 @@ public class DocumentGraphMapper {
 this.documentTree = documentTree;
 this.pageData = new LinkedList<>();
-this.documentTextData = Arrays.stream(documentData.getDocumentTextData()).toList();
-this.documentPositionData = Arrays.stream(documentData.getDocumentPositionData()).toList();
+this.documentTextData = Arrays.stream(documentData.getDocumentTextData())
+        .toList();
+this.documentPositionData = Arrays.stream(documentData.getDocumentPositionData())
+        .toList();
 }


@@ -1,14 +1,18 @@
 package com.iqser.red.service.redaction.v1.server.service.document;
 
-import static com.iqser.red.service.redaction.v1.server.service.document.EntityCreationUtility.*;
+import static com.iqser.red.service.redaction.v1.server.service.document.EntityCreationUtility.addEntityToNodeEntitySets;
+import static com.iqser.red.service.redaction.v1.server.service.document.EntityCreationUtility.addToPages;
+import static com.iqser.red.service.redaction.v1.server.service.document.EntityCreationUtility.allEntitiesIntersectAndHaveSameTypes;
+import static com.iqser.red.service.redaction.v1.server.service.document.EntityCreationUtility.checkIfBothStartAndEndAreEmpty;
+import static com.iqser.red.service.redaction.v1.server.service.document.EntityCreationUtility.findIntersectingSubNodes;
+import static com.iqser.red.service.redaction.v1.server.service.document.EntityCreationUtility.toLineAfterTextRange;
+import static com.iqser.red.service.redaction.v1.server.service.document.EntityCreationUtility.truncateEndIfLineBreakIsBetween;
 import static com.iqser.red.service.redaction.v1.server.utils.SeparatorUtils.boundaryIsSurroundedBySeparators;
 
 import java.util.Collection;
-import java.util.Collections;
 import java.util.Comparator;
 import java.util.LinkedList;
 import java.util.List;
-import java.util.NoSuchElementException;
 import java.util.Optional;
 import java.util.Set;
 import java.util.stream.Collectors;
@@ -276,7 +280,8 @@ public class EntityCreationService {
 "this is some text. a here is more text" and "here is more text". We only want to keep the latter.
 */
 return entityTextRanges.stream()
-        .filter(boundary -> entityTextRanges.stream().noneMatch(innerBoundary -> !innerBoundary.equals(boundary) && innerBoundary.containedBy(boundary)))
+        .filter(boundary -> entityTextRanges.stream()
+                .noneMatch(innerBoundary -> !innerBoundary.equals(boundary) && innerBoundary.containedBy(boundary)))
         .toList();
 }
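The filter above keeps only those matched ranges that do not fully contain another, different match, so when one hit wholly encloses a shorter hit ("this is some text. a here is more text" vs. "here is more text"), only the shorter one survives. A minimal standalone illustration with integer ranges (the `Range` record and `containedBy` semantics are assumptions mirroring the service's `TextRange`, not its actual API):

```java
import java.util.List;

public class InnermostRanges {

    // Illustrative stand-in for TextRange; containedBy means "lies fully inside other".
    record Range(int start, int end) {
        boolean containedBy(Range other) {
            return other.start <= start && end <= other.end;
        }
    }

    // Drop any range that fully contains a different matched range.
    static List<Range> keepInnermost(List<Range> ranges) {
        return ranges.stream()
                .filter(outer -> ranges.stream()
                        .noneMatch(inner -> !inner.equals(outer) && inner.containedBy(outer)))
                .toList();
    }

    public static void main(String[] args) {
        // (20, 38) lies inside (0, 38), so the enclosing match is discarded.
        List<Range> matches = List.of(new Range(0, 38), new Range(20, 38));
        System.out.println(keepInnermost(matches)); // [Range[start=20, end=38]]
    }
}
```

The check is quadratic in the number of matches, which is fine here because matches for a single search string within one node are few.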
@@ -351,10 +356,10 @@ public class EntityCreationService {
 return tableNode.streamTableCells()
         .flatMap(tableCell -> lineAfterBoundariesAcrossColumns(RedactionSearchUtility.findTextRangesByString(string, tableCell.getTextBlock()),
                 tableCell,
                 type,
                 entityType,
                 tableNode));
 }
@@ -362,10 +367,10 @@ public class EntityCreationService {
 return tableNode.streamTableCells()
         .flatMap(tableCell -> lineAfterBoundariesAcrossColumns(RedactionSearchUtility.findTextRangesByStringIgnoreCase(string, tableCell.getTextBlock()),
                 tableCell,
                 type,
                 entityType,
                 tableNode));
 }
@ -500,7 +505,10 @@ public class EntityCreationService {
public Stream<TextEntity> bySemanticNodeParagraphsOnly(SemanticNode node, String type, EntityType entityType) { public Stream<TextEntity> bySemanticNodeParagraphsOnly(SemanticNode node, String type, EntityType entityType) {
return node.streamAllSubNodesOfType(NodeType.PARAGRAPH).map(semanticNode -> bySemanticNode(semanticNode, type, entityType)).filter(Optional::isPresent).map(Optional::get); return node.streamAllSubNodesOfType(NodeType.PARAGRAPH)
.map(semanticNode -> bySemanticNode(semanticNode, type, entityType))
.filter(Optional::isPresent)
.map(Optional::get);
} }
@@ -590,11 +598,18 @@ public class EntityCreationService {
     throw new IllegalArgumentException(String.format("%s is not in the %s of the provided semantic node %s", textRange, node.getTextRange(), node));
 }
 TextRange trimmedTextRange = textRange.trim(node.getTextBlock());
+if (trimmedTextRange.length() == 0) {
+    return Optional.empty();
+}
 TextEntity entity = TextEntity.initialEntityNode(trimmedTextRange, type, entityType, node);
 if (node.getEntities().contains(entity)) {
-    Optional<TextEntity> optionalTextEntity = node.getEntities().stream().filter(e -> e.equals(entity) && e.type().equals(type)).peek(e -> e.addEngines(engines)).findAny();
+    Optional<TextEntity> optionalTextEntity = node.getEntities()
+            .stream()
+            .filter(e -> e.equals(entity) && e.type().equals(type))
+            .peek(e -> e.addEngines(engines))
+            .findAny();
     if (optionalTextEntity.isEmpty()) {
-        return optionalTextEntity; // Entity has been recategorized and should not be created at all.
+        return Optional.empty(); // Entity has been recategorized and should not be created at all.
     }
     TextEntity existingEntity = optionalTextEntity.get();
     if (existingEntity.getTextRange().equals(textRange)) {
@@ -606,7 +621,7 @@ public class EntityCreationService {
     }
     return Optional.empty(); // Entity has been resized, if there are duplicates they should be treated there
 }
-addEntityToGraph(entity, node);
+addEntityToGraph(entity, node.getDocumentTree());
 entity.addEngines(engines);
 insertToKieSession(entity);
 return Optional.of(entity);
@@ -635,6 +650,8 @@ public class EntityCreationService {
 }
 
+// Do not use anymore. This might not work correctly due to duplicate textranges not being taken into account here.
+@Deprecated(forRemoval = true)
 public TextEntity mergeEntitiesOfSameType(List<TextEntity> entitiesToMerge, String type, EntityType entityType, SemanticNode node) {
 
     if (!allEntitiesIntersectAndHaveSameTypes(entitiesToMerge)) {
@ -647,27 +664,44 @@ public class EntityCreationService {
return entitiesToMerge.get(0); return entitiesToMerge.get(0);
} }
TextEntity mergedEntity = TextEntity.initialEntityNode(TextRange.merge(entitiesToMerge.stream().map(TextEntity::getTextRange).toList()), type, entityType, node); TextEntity mergedEntity = TextEntity.initialEntityNode(TextRange.merge(entitiesToMerge.stream()
mergedEntity.addEngines(entitiesToMerge.stream().flatMap(entityNode -> entityNode.getEngines().stream()).collect(Collectors.toSet())); .map(TextEntity::getTextRange)
entitiesToMerge.stream().map(TextEntity::getMatchedRuleList).flatMap(Collection::stream).forEach(matchedRule -> mergedEntity.getMatchedRuleList().add(matchedRule)); .toList()), type, entityType, node);
mergedEntity.addEngines(entitiesToMerge.stream()
.flatMap(entityNode -> entityNode.getEngines()
.stream())
.collect(Collectors.toSet()));
entitiesToMerge.stream()
.map(TextEntity::getMatchedRuleList)
.flatMap(Collection::stream)
.forEach(matchedRule -> mergedEntity.getMatchedRuleList().add(matchedRule));
entitiesToMerge.stream() entitiesToMerge.stream()
.map(TextEntity::getManualOverwrite) .map(TextEntity::getManualOverwrite)
.map(ManualChangeOverwrite::getManualChangeLog) .map(ManualChangeOverwrite::getManualChangeLog)
.flatMap(Collection::stream) .flatMap(Collection::stream)
.forEach(manualChange -> mergedEntity.getManualOverwrite().addChange(manualChange)); .forEach(manualChange -> mergedEntity.getManualOverwrite().addChange(manualChange));
mergedEntity.setDictionaryEntry(entitiesToMerge.stream().anyMatch(TextEntity::isDictionaryEntry)); mergedEntity.setDictionaryEntry(entitiesToMerge.stream()
mergedEntity.setDossierDictionaryEntry(entitiesToMerge.stream().anyMatch(TextEntity::isDossierDictionaryEntry)); .anyMatch(TextEntity::isDictionaryEntry));
mergedEntity.setDossierDictionaryEntry(entitiesToMerge.stream()
.anyMatch(TextEntity::isDossierDictionaryEntry));
entityEnrichmentService.enrichEntity(mergedEntity, node.getTextBlock());
addEntityToGraph(mergedEntity, node); addEntityToGraph(mergedEntity, node);
insertToKieSession(mergedEntity); insertToKieSession(mergedEntity);
entitiesToMerge.stream()
.filter(e -> !e.equals(mergedEntity))
.forEach(node.getEntities()::remove);
return mergedEntity; return mergedEntity;
} }
public Stream<TextEntity> copyEntities(List<TextEntity> entities, String type, EntityType entityType, SemanticNode node) { public Stream<TextEntity> copyEntities(List<TextEntity> entities, String type, EntityType entityType, SemanticNode node) {
return entities.stream().map(entity -> copyEntity(entity, type, entityType, node)); return entities.stream()
.map(entity -> copyEntity(entity, type, entityType, node));
} }
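The merge above relies on `TextRange.merge` to collapse the entities' ranges into one; that class is not part of this diff. A minimal sketch of the presumed covering-span semantics, with a hypothetical `Range` record standing in for the real `TextRange`:

```java
import java.util.List;

public class TextRangeMergeSketch {

    // Hypothetical stand-in for the real TextRange class (not shown in this diff).
    record Range(int start, int end) {}

    // Merging intersecting ranges of same-type entities presumably reduces to
    // the smallest span covering all of them.
    public static Range merge(List<Range> ranges) {
        int start = ranges.stream().mapToInt(Range::start).min().orElseThrow();
        int end = ranges.stream().mapToInt(Range::end).max().orElseThrow();
        return new Range(start, end);
    }

    public static void main(String[] args) {
        // Two overlapping ranges collapse into one covering span.
        System.out.println(merge(List.of(new Range(3, 10), new Range(8, 15)))); // Range[start=3, end=15]
    }
}
```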
@@ -741,25 +775,19 @@ public class EntityCreationService {
     public void addEntityToGraph(TextEntity entity, SemanticNode node) {
         DocumentTree documentTree = node.getDocumentTree();
-        try {
-            if (node.getEntities().contains(entity)) {
-                // If entity already exists and it has a different text range, we add the text range to the list of duplicated text ranges
-                node.getEntities().stream()//
-                        .filter(e -> e.equals(entity))//
-                        .filter(e -> !e.getTextRange().equals(entity.getTextRange()))//
-                        .findAny()//
-                        .ifPresent(entityToDuplicate -> addDuplicateEntityToGraph(entityToDuplicate, entity.getTextRange(), node));
-            } else {
-                entity.addIntersectingNode(documentTree.getRoot().getNode());
-                addEntityToGraph(entity, documentTree);
-            }
-        } catch (NoSuchElementException e) {
-            entity.setDeepestFullyContainingNode(documentTree.getRoot().getNode());
-            entityEnrichmentService.enrichEntity(entity, entity.getDeepestFullyContainingNode().getTextBlock());
-            entity.addIntersectingNode(documentTree.getRoot().getNode());
-            addToPages(entity);
-            addEntityToNodeEntitySets(entity);
+        if (node.getEntities().contains(entity)) {
+            // If entity already exists and it has a different text range, we add the text range to the list of duplicated text ranges
+            node.getEntities()
+                    .stream()//
+                    .filter(e -> e.equals(entity))//
+                    .filter(e -> !e.getTextRange().equals(entity.getTextRange()))//
+                    .findAny()
+                    .ifPresent(e -> addDuplicateEntityToGraph(e, entity.getTextRange(), node));
+        } else {
+            addEntityToGraph(entity, documentTree);
         }
     }
@@ -770,10 +798,11 @@ public class EntityCreationService {
         SemanticNode deepestSharedNode = entityToDuplicate.getIntersectingNodes()
                 .stream()
                 .sorted(Comparator.comparingInt(n -> -n.getTreeId().size()))
-                .filter(intersectingNode -> entityToDuplicate.getDuplicateTextRanges().stream().allMatch(tr -> intersectingNode.getTextRange().contains(tr)) && //
-                        intersectingNode.getTextRange().contains(entityToDuplicate.getTextRange()))
-                .findFirst()
-                .orElse(node.getDocumentTree().getRoot().getNode());
+                .filter(intersectingNode -> entityToDuplicate.getDuplicateTextRanges()
+                        .stream()
+                        .allMatch(tr -> intersectingNode.getTextRange().contains(tr)) && //
+                        intersectingNode.getTextRange().contains(entityToDuplicate.getTextRange()))
+                .findFirst().orElse(node.getDocumentTree().getRoot().getNode());
         entityToDuplicate.setDeepestFullyContainingNode(deepestSharedNode);
@@ -784,7 +813,8 @@ public class EntityCreationService {
             return;
         }
         additionalIntersectingNode.getEntities().add(entityToDuplicate);
-        additionalIntersectingNode.getPages(newTextRange).forEach(page -> page.getEntities().add(entityToDuplicate));
+        additionalIntersectingNode.getPages(newTextRange)
+                .forEach(page -> page.getEntities().add(entityToDuplicate));
         entityToDuplicate.addIntersectingNode(additionalIntersectingNode);
         });
     }
@@ -792,12 +822,7 @@ public class EntityCreationService {
     private void addEntityToGraph(TextEntity entity, DocumentTree documentTree) {
-        SemanticNode containingNode = documentTree.childNodes(Collections.emptyList())
-                .filter(node -> node.getTextBlock().containsTextRange(entity.getTextRange()))
-                .findFirst()
-                .orElseThrow(() -> new NoSuchElementException("No containing Node found!"));
-        containingNode.addThisToEntityIfIntersects(entity);
+        documentTree.getRoot().getNode().addThisToEntityIfIntersects(entity);
         TextBlock textBlock = entity.getDeepestFullyContainingNode().getTextBlock();
         entityEnrichmentService.enrichEntity(entity, textBlock);
@@ -806,5 +831,4 @@ public class EntityCreationService {
         addEntityToNodeEntitySets(entity);
     }
 }


@@ -47,7 +47,9 @@ public class EntityFindingUtility {
     }

-    public Optional<TextEntity> findClosestEntityAndReturnEmptyIfNotFound(PrecursorEntity precursorEntity, Map<String, List<TextEntity>> entitiesWithSameValue, double matchThreshold) {
+    public Optional<TextEntity> findClosestEntityAndReturnEmptyIfNotFound(PrecursorEntity precursorEntity,
+                                                                          Map<String, List<TextEntity>> entitiesWithSameValue,
+                                                                          double matchThreshold) {
         if (precursorEntity.getValue() == null) {
             return Optional.empty();
@@ -56,7 +58,7 @@ public class EntityFindingUtility {
         List<TextEntity> possibleEntities = entitiesWithSameValue.get(precursorEntity.getValue().toLowerCase(Locale.ENGLISH));
         if (entityIdentifierValueNotFound(possibleEntities)) {
-            log.warn("Entity could not be created with precursorEntity: {}, due to the value {} not being found anywhere.", precursorEntity, precursorEntity.getValue());
+            log.info("Entity could not be created with precursorEntity: {}, due to the value {} not being found anywhere.", precursorEntity, precursorEntity.getValue());
             return Optional.empty();
         }
@@ -66,18 +68,22 @@ public class EntityFindingUtility {
                 .min(Comparator.comparingDouble(ClosestEntity::getDistance));
         if (optionalClosestEntity.isEmpty()) {
-            log.warn("No Entity with value {} found on page {}", precursorEntity.getValue(), precursorEntity.getEntityPosition());
+            log.info("No Entity with value {} found on page {}", precursorEntity.getValue(), precursorEntity.getEntityPosition());
             return Optional.empty();
         }
         ClosestEntity closestEntity = optionalClosestEntity.get();
         if (closestEntity.getDistance() > matchThreshold) {
-            log.warn("For entity {} on page {} with positions {} distance to closest found entity is {} and therefore higher than the threshold of {}",
+            log.info("For entity {} on page {} with positions {} distance to closest found entity is {} and therefore higher than the threshold of {}",
                     precursorEntity.getValue(),
-                    precursorEntity.getEntityPosition().get(0).pageNumber(),
-                    precursorEntity.getEntityPosition().stream().map(RectangleWithPage::rectangle2D).toList(),
-                    closestEntity.getDistance(),
-                    matchThreshold);
+                    precursorEntity.getEntityPosition()
+                            .get(0).pageNumber(),
+                    precursorEntity.getEntityPosition()
+                            .stream()
+                            .map(RectangleWithPage::rectangle2D)
+                            .toList(),
+                    closestEntity.getDistance(),
+                    matchThreshold);
             return Optional.empty();
         }
@@ -93,8 +99,14 @@ public class EntityFindingUtility {
     private static boolean pagesMatch(TextEntity entity, List<RectangleWithPage> originalPositions) {
-        Set<Integer> entityPageNumbers = entity.getPositionsOnPagePerPage().stream().map(PositionOnPage::getPage).map(Page::getNumber).collect(Collectors.toSet());
-        Set<Integer> originalPageNumbers = originalPositions.stream().map(RectangleWithPage::pageNumber).collect(Collectors.toSet());
+        Set<Integer> entityPageNumbers = entity.getPositionsOnPagePerPage()
+                .stream()
+                .map(PositionOnPage::getPage)
+                .map(Page::getNumber)
+                .collect(Collectors.toSet());
+        Set<Integer> originalPageNumbers = originalPositions.stream()
+                .map(RectangleWithPage::pageNumber)
+                .collect(Collectors.toSet());
         return entityPageNumbers.containsAll(originalPageNumbers);
     }
@@ -105,15 +117,16 @@ public class EntityFindingUtility {
             return Double.MAX_VALUE;
         }
         return originalPositions.stream()
-                .mapToDouble(rectangleWithPage -> calculateMinDistancePerRectangle(entity, rectangleWithPage.pageNumber(), rectangleWithPage.rectangle2D()))
-                .average()
+                .mapToDouble(rectangleWithPage -> calculateMinDistancePerRectangle(entity, rectangleWithPage.pageNumber(), rectangleWithPage.rectangle2D())).average()
                 .orElse(Double.MAX_VALUE);
     }

     private static long countRectangles(TextEntity entity) {
-        return entity.getPositionsOnPagePerPage().stream().mapToLong(redactionPosition -> redactionPosition.getRectanglePerLine().size()).sum();
+        return entity.getPositionsOnPagePerPage()
+                .stream()
+                .mapToLong(redactionPosition -> redactionPosition.getRectanglePerLine().size()).sum();
     }
@@ -161,7 +174,8 @@ public class EntityFindingUtility {
                 pageNumbers.stream().filter(pageNumber -> !node.onPage(pageNumber)).toList(),
                 node.getPages()));
         }
-        SearchImplementation searchImplementation = new SearchImplementation(entryValues, true);
+        SearchImplementation searchImplementation = new SearchImplementation(entryValues.stream().map(String::trim).collect(Collectors.toSet()), true);
         return searchImplementation.getBoundaries(node.getTextBlock(), node.getTextRange())
                 .stream()


@@ -9,7 +9,6 @@ import java.util.Optional;
 import java.util.Set;
 import java.util.stream.Collectors;
-import org.springframework.beans.factory.annotation.Autowired;
 import org.springframework.stereotype.Service;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.imported.ImportedRedactions;
@@ -23,46 +22,34 @@ import com.iqser.red.service.redaction.v1.server.model.document.nodes.SemanticNode;
 import com.iqser.red.service.redaction.v1.server.service.DictionaryService;
 import lombok.AccessLevel;
+import lombok.RequiredArgsConstructor;
 import lombok.experimental.FieldDefaults;
 import lombok.extern.slf4j.Slf4j;

 @Slf4j
 @Service
+@RequiredArgsConstructor
 @FieldDefaults(makeFinal = true, level = AccessLevel.PRIVATE)
 public class EntityFromPrecursorCreationService {

     static double MATCH_THRESHOLD = 10; // Is compared to the average sum of distances in pdf coordinates for each corner of the bounding box of the entities

     EntityFindingUtility entityFindingUtility;
-    EntityCreationService entityCreationService;
     DictionaryService dictionaryService;

-    @Autowired
-    public EntityFromPrecursorCreationService(EntityEnrichmentService entityEnrichmentService, DictionaryService dictionaryService, EntityFindingUtility entityFindingUtility) {
-        this.entityFindingUtility = entityFindingUtility;
-        entityCreationService = new EntityCreationService(entityEnrichmentService);
-        this.dictionaryService = dictionaryService;
-    }

     public List<PrecursorEntity> createEntitiesIfFoundAndReturnNotFoundEntries(ManualRedactions manualRedactions, SemanticNode node, String dossierTemplateId) {
         Set<IdRemoval> idRemovals = manualRedactions.getIdsToRemove();
         List<PrecursorEntity> manualEntities = manualRedactions.getEntriesToAdd()
                 .stream()
-                .filter(manualRedactionEntry -> !(idRemovals.stream()
-                        .map(BaseAnnotation::getAnnotationId)
-                        .toList()
-                        .contains(manualRedactionEntry.getAnnotationId()) && manualRedactionEntry.getRequestDate()
-                        .isBefore(idRemovals.stream()
-                                .filter(idRemoval -> idRemoval.getAnnotationId().equals(manualRedactionEntry.getAnnotationId()))
-                                .findFirst()
-                                .get()
-                                .getRequestDate())))
-                .filter(manualRedactionEntry -> !(manualRedactionEntry.isAddToDictionary() || manualRedactionEntry.isAddToDossierDictionary()))
-                .map(manualRedactionEntry -> PrecursorEntity.fromManualRedactionEntry(manualRedactionEntry,
-                        dictionaryService.isHint(manualRedactionEntry.getType(), dossierTemplateId)))
+                .filter(BaseAnnotation::isLocal)
+                .filter(manualRedactionEntry -> idRemovals.stream()
+                        .filter(idRemoval -> idRemoval.getAnnotationId().equals(manualRedactionEntry.getAnnotationId()))
+                        .filter(idRemoval -> idRemoval.getRequestDate().isAfter(manualRedactionEntry.getRequestDate()))
+                        .findAny()//
+                        .isEmpty())
+                .map(manualRedactionEntry -> //
+                        PrecursorEntity.fromManualRedactionEntry(manualRedactionEntry, dictionaryService.isHint(manualRedactionEntry.getType(), dossierTemplateId)))
                 .peek(manualEntity -> {
                     if (manualEntity.getEntityType().equals(EntityType.HINT)) {
                         manualEntity.skip("MAN.5.1", "manual hint is skipped by default");
@@ -71,7 +58,6 @@ public class EntityFromPrecursorCreationService {
                     }
                 })
                 .toList();
         return toTextEntity(manualEntities, node);
     }
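The reworked filter above keeps a manual entry unless a removal with the same annotation id and a later request date exists, which lets a redaction be re-added at the same position after an earlier removal (RED-9091). A standalone sketch of that rule, using hypothetical record types in place of the real entry and `IdRemoval` classes:

```java
import java.time.Instant;
import java.util.List;

public class RemovalFilterSketch {

    // Hypothetical stand-ins for the real annotation types, which carry more state.
    record Entry(String annotationId, Instant requestDate) {}
    record IdRemoval(String annotationId, Instant requestDate) {}

    // An entry survives unless some removal targets its id with a later request date.
    public static boolean isKept(Entry entry, List<IdRemoval> removals) {
        return removals.stream()
                .filter(r -> r.annotationId().equals(entry.annotationId()))
                .filter(r -> r.requestDate().isAfter(entry.requestDate()))
                .findAny()
                .isEmpty();
    }

    public static void main(String[] args) {
        Instant earlier = Instant.parse("2024-05-01T10:00:00Z");
        Instant later = Instant.parse("2024-05-01T11:00:00Z");
        // Removal issued before the entry was re-added: the entry is kept.
        System.out.println(isKept(new Entry("a1", later), List.of(new IdRemoval("a1", earlier)))); // true
        // Removal issued after the entry: the entry is dropped.
        System.out.println(isKept(new Entry("a1", earlier), List.of(new IdRemoval("a1", later)))); // false
    }
}
```

Compared to the old code, this also avoids the unguarded `findFirst().get()` on the removals stream.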
@@ -90,8 +76,14 @@ public class EntityFromPrecursorCreationService {
     public List<PrecursorEntity> toTextEntity(List<PrecursorEntity> precursorEntities, SemanticNode node) {
-        var notFoundEntities = precursorEntities.stream().filter(PrecursorEntity::isRectangle).collect(Collectors.toList());
-        var findableEntities = precursorEntities.stream().filter(precursorEntity -> !precursorEntity.isRectangle()).toList();
+        var notFoundEntities = precursorEntities.stream()
+                .filter(PrecursorEntity::isRectangle)
+                .collect(Collectors.toList());
+        var findableEntities = precursorEntities.stream()
+                .filter(precursorEntity -> !precursorEntity.isRectangle())
+                .toList();
         Map<String, List<TextEntity>> tempEntitiesByValue = entityFindingUtility.findAllPossibleEntitiesAndGroupByValue(node, findableEntities);
         for (PrecursorEntity precursorEntity : findableEntities) {
@@ -102,7 +94,12 @@ public class EntityFromPrecursorCreationService {
             }
             createCorrectEntity(precursorEntity, optionalClosestEntity.get());
         }
-        tempEntitiesByValue.values().stream().flatMap(Collection::stream).forEach(TextEntity::removeFromGraph);
+        tempEntitiesByValue.values()
+                .stream()
+                .flatMap(Collection::stream)
+                .forEach(TextEntity::removeFromGraph);
         return notFoundEntities;
     }
@@ -113,9 +110,23 @@ public class EntityFromPrecursorCreationService {
      * @param precursorEntity The entity identifier for the RedactionEntity.
      * @param closestEntity The closest Boundary to the RedactionEntity.
      */
-    private void createCorrectEntity(PrecursorEntity precursorEntity, TextEntity closestEntity) {
-        TextEntity correctEntity = TextEntity.initialEntityNode(closestEntity.getTextRange(), precursorEntity.type(), precursorEntity.getEntityType(), precursorEntity.getId());
+    public static TextEntity createCorrectEntity(PrecursorEntity precursorEntity, TextEntity closestEntity) {
+        return createCorrectEntity(precursorEntity, closestEntity, false);
+    }
+
+    public static TextEntity createCorrectEntity(PrecursorEntity precursorEntity, TextEntity closestEntity, boolean generateId) {
+        TextEntity correctEntity;
+        if (generateId) {
+            correctEntity = TextEntity.initialEntityNode(closestEntity.getTextRange(),
+                    precursorEntity.type(),
+                    precursorEntity.getEntityType(),
+                    closestEntity.getDeepestFullyContainingNode());
+        } else {
+            correctEntity = TextEntity.initialEntityNode(closestEntity.getTextRange(), precursorEntity.type(), precursorEntity.getEntityType(), precursorEntity.getId());
+        }
         correctEntity.setDeepestFullyContainingNode(closestEntity.getDeepestFullyContainingNode());
         correctEntity.setIntersectingNodes(new ArrayList<>(closestEntity.getIntersectingNodes()));
         correctEntity.setDuplicateTextRanges(new ArrayList<>(closestEntity.getDuplicateTextRanges()));
@@ -125,14 +136,17 @@ public class EntityFromPrecursorCreationService {
         correctEntity.setTextAfter(closestEntity.getTextAfter());
         correctEntity.setTextBefore(closestEntity.getTextBefore());
-        correctEntity.getIntersectingNodes().forEach(n -> n.getEntities().add(correctEntity));
-        correctEntity.getPages().forEach(page -> page.getEntities().add(correctEntity));
+        correctEntity.getIntersectingNodes()
+                .forEach(n -> n.getEntities().add(correctEntity));
+        correctEntity.getPages()
+                .forEach(page -> page.getEntities().add(correctEntity));
         correctEntity.addMatchedRules(precursorEntity.getMatchedRuleList());
         correctEntity.setDictionaryEntry(precursorEntity.isDictionaryEntry());
         correctEntity.setDossierDictionaryEntry(precursorEntity.isDossierDictionaryEntry());
         correctEntity.getManualOverwrite().addChanges(precursorEntity.getManualOverwrite().getManualChangeLog());
         correctEntity.addEngines(precursorEntity.getEngines());
+        return correctEntity;
     }
 }


@@ -50,7 +50,9 @@ public class ComponentDroolsExecutionService {
                 .filter(entityLogEntry -> entityLogEntry.getState().equals(EntryState.APPLIED))
                 .map(entry -> Entity.fromEntityLogEntry(entry, document))
                 .forEach(kieSession::insert);
-        fileAttributes.stream().filter(f -> f.getValue() != null).forEach(kieSession::insert);
+        fileAttributes.stream()
+                .filter(f -> f.getValue() != null)
+                .forEach(kieSession::insert);
         CompletableFuture<Void> completableFuture = CompletableFuture.supplyAsync(() -> {
             kieSession.fireAllRules();
@@ -58,7 +60,8 @@ public class ComponentDroolsExecutionService {
         });
         try {
-            completableFuture.orTimeout(settings.getDroolsExecutionTimeoutSecs(), TimeUnit.SECONDS).get();
+            completableFuture.orTimeout(settings.getDroolsExecutionTimeoutSecs(document.getNumberOfPages()), TimeUnit.SECONDS)
+                    .get();
         } catch (ExecutionException e) {
             kieSession.dispose();
             if (e.getCause() instanceof TimeoutException) {
@@ -71,7 +74,9 @@ public class ComponentDroolsExecutionService {
         }
         List<FileAttribute> resultingFileAttributes = getFileAttributes(kieSession);
-        List<Component> components = getComponents(kieSession).stream().sorted(ComponentComparator.first()).toList();
+        List<Component> components = getComponents(kieSession).stream()
+                .sorted(ComponentComparator.first())
+                .toList();
         kieSession.dispose();
         return components;
     }


@@ -16,6 +16,7 @@ import org.springframework.stereotype.Service;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.FileAttribute;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.RuleFileType;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.ManualRedactions;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.BaseAnnotation;
 import com.iqser.red.service.redaction.v1.server.RedactionServiceSettings;
 import com.iqser.red.service.redaction.v1.server.model.NerEntities;
 import com.iqser.red.service.redaction.v1.server.model.dictionary.Dictionary;
@@ -55,7 +56,14 @@ public class EntityDroolsExecutionService {
             ManualRedactions manualRedactions,
             NerEntities nerEntities) {
-        return executeRules(kieContainer, document, document.streamChildren().toList(), dictionary, fileAttributes, manualRedactions, nerEntities);
+        return executeRules(kieContainer,
+                document,
+                document.streamChildren()
+                        .toList(),
+                dictionary,
+                fileAttributes,
+                manualRedactions,
+                nerEntities);
     }
@@ -80,19 +88,28 @@ public class EntityDroolsExecutionService {
         kieSession.setGlobal("dictionary", dictionary);
         kieSession.insert(document);
-        document.getEntities().forEach(kieSession::insert);
+        document.getEntities()
+                .forEach(kieSession::insert);
         sectionsToAnalyze.forEach(kieSession::insert);
-        sectionsToAnalyze.stream().flatMap(SemanticNode::streamAllSubNodes).forEach(kieSession::insert);
-        document.getPages().forEach(kieSession::insert);
-        fileAttributes.stream().filter(f -> f.getValue() != null).forEach(kieSession::insert);
+        sectionsToAnalyze.stream()
+                .flatMap(SemanticNode::streamAllSubNodes)
+                .forEach(kieSession::insert);
+        document.getPages()
+                .forEach(kieSession::insert);
+        fileAttributes.stream()
+                .filter(f -> f.getValue() != null)
+                .forEach(kieSession::insert);
         if (manualRedactions != null) {
-            manualRedactions.getResizeRedactions().forEach(kieSession::insert);
-            manualRedactions.getRecategorizations().forEach(kieSession::insert);
-            manualRedactions.getEntriesToAdd().forEach(kieSession::insert);
-            manualRedactions.getForceRedactions().forEach(kieSession::insert);
-            manualRedactions.getIdsToRemove().forEach(kieSession::insert);
-            manualRedactions.getLegalBasisChanges().forEach(kieSession::insert);
+            manualRedactions.buildAll()
+                    .stream()
+                    .filter(BaseAnnotation::isLocal)
+                    .forEach(kieSession::insert);
         }
         kieSession.insert(nerEntities);
@@ -105,7 +122,8 @@ public class EntityDroolsExecutionService {
         });
         try {
-            completableFuture.orTimeout(settings.getDroolsExecutionTimeoutSecs(), TimeUnit.SECONDS).get();
+            completableFuture.orTimeout(settings.getDroolsExecutionTimeoutSecs(document.getNumberOfPages()), TimeUnit.SECONDS)
+                    .get();
         } catch (ExecutionException e) {
             kieSession.dispose();
             if (e.getCause() instanceof TimeoutException) {
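Both Drools services now derive the rule-execution timeout from the document's page count before passing it to `orTimeout`. The timeout-and-dispose pattern itself can be reproduced in isolation; the page-based scaling inside `RedactionServiceSettings` is not shown in this diff, so the helper below only mirrors the `orTimeout(...).get()` / `ExecutionException` handling:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class DroolsTimeoutSketch {

    // Runs a task with a deadline and reports the outcome, mirroring how the
    // services above block on the future and unwrap the TimeoutException cause.
    public static String runWithTimeout(Runnable task, long timeoutSecs) {
        CompletableFuture<Void> future = CompletableFuture.runAsync(task);
        try {
            future.orTimeout(timeoutSecs, TimeUnit.SECONDS).get();
            return "completed";
        } catch (ExecutionException e) {
            // orTimeout completes the future exceptionally with a TimeoutException.
            if (e.getCause() instanceof TimeoutException) {
                return "timed out";
            }
            return "failed";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        System.out.println(runWithTimeout(() -> { }, 5)); // completed
        System.out.println(runWithTimeout(() -> {
            try {
                Thread.sleep(3_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, 1)); // timed out
    }
}
```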


@@ -30,8 +30,7 @@ public class KieContainerCreationService {
     private final RulesClient rulesClient;

-    @Observed(name = "KieContainerCreationService",
-            contextualName = "get-kie-container")
+    @Observed(name = "KieContainerCreationService", contextualName = "get-kie-container")
     public KieWrapper getLatestKieContainer(String dossierTemplateId, RuleFileType ruleFileType) {
         try {
@@ -65,7 +64,6 @@ public class KieContainerCreationService {
         try {
             return kieServices.newKieContainer(getReleaseId(dossierTemplateId, version, ruleFileType));
         } catch (Exception e) {
             registerNewKieContainerVersion(dossierTemplateId, version, ruleFileType);
             return kieServices.newKieContainer(getReleaseId(dossierTemplateId, version, ruleFileType));
         }


@@ -1,5 +1,6 @@
 package com.iqser.red.service.redaction.v1.server.utils;

+import java.util.Collections;
 import java.util.LinkedList;
 import java.util.Set;
 import java.util.function.BiConsumer;
@@ -17,7 +18,7 @@ public class MigratedIdsCollector implements Collector<MigrationEntity, Migrated
     @Override
     public Supplier<MigratedIds> supplier() {
-        return () -> new MigratedIds(new LinkedList<>());
+        return () -> new MigratedIds(new LinkedList<>(), Collections.emptyList(), Collections.emptyList());
     }


@@ -0,0 +1,327 @@
package com.iqser.red.service.redaction.v1.server;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.when;
import java.io.File;
import java.io.FileInputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.zip.GZIPInputStream;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Disabled;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.mock.mockito.MockBean;
import org.springframework.context.annotation.Import;
import org.springframework.core.io.ClassPathResource;
import org.springframework.test.context.junit.jupiter.SpringExtension;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.google.common.collect.Sets;
import com.iqser.red.commons.jackson.ObjectMapperFactory;
import com.iqser.red.service.dictionarymerge.commons.DictionaryEntryModel;
import com.iqser.red.service.persistence.service.v1.api.shared.model.AnalyzeRequest;
import com.iqser.red.service.persistence.service.v1.api.shared.model.RuleFileType;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.ManualRedactions;
import com.iqser.red.service.persistence.service.v1.api.shared.model.common.JSONPrimitive;
import com.iqser.red.service.persistence.service.v1.api.shared.model.dossiertemplate.dossier.file.FileType;
import com.iqser.red.service.redaction.v1.server.client.DictionaryClient;
import com.iqser.red.service.redaction.v1.server.client.LegalBasisClient;
import com.iqser.red.service.redaction.v1.server.client.RulesClient;
import com.iqser.red.service.redaction.v1.server.model.dictionary.Dictionary;
import com.iqser.red.service.redaction.v1.server.model.dictionary.DictionaryIncrement;
import com.iqser.red.service.redaction.v1.server.model.dictionary.DictionaryModel;
import com.iqser.red.service.redaction.v1.server.model.dictionary.DictionaryVersion;
import com.iqser.red.service.redaction.v1.server.service.AnalyzeService;
import com.iqser.red.service.redaction.v1.server.service.DictionaryService;
import com.iqser.red.service.redaction.v1.server.storage.RedactionStorageService;
import com.iqser.red.service.redaction.v1.server.utils.exception.NotFoundException;
import com.iqser.red.storage.commons.service.StorageService;
import com.knecon.fforesight.tenantcommons.TenantsClient;
import lombok.SneakyThrows;
import lombok.extern.slf4j.Slf4j;
@Slf4j
@ExtendWith(SpringExtension.class)
@SpringBootTest(classes = Application.class, webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
@Import(RedactionIntegrationTest.RedactionIntegrationTestConfiguration.class)
@Disabled
/*
 * This test is meant to be used directly with a download from blob storage (e.g. minio). You need to define the dossier template you want to use by supplying an absolute path.
 * The dossier template will then be parsed for dictionaries, colors, entities, and rules. This is defined once for all tests.
 * Inside a test you supply a path to your minio download folder. The files should still be zipped in this folder.
 * The files will then be checked for completeness and uploaded to the FileSystemBackedStorageService.
 * This way you can recreate what is happening on the stack almost exactly.
 */
public class AnalysisEnd2EndTest {
Path dossierTemplateToUse = Path.of("/home/kschuettler/iqser/business-logic/redactmanager/prod-cp-eu-reg/EFSA_sanitisation_GFL_v1"); // Add your dossier-template here
ObjectMapper mapper = ObjectMapperFactory.create();
final String TENANT_ID = "tenant";
@Autowired
StorageService storageService;
@Autowired
protected AnalyzeService analyzeService;
@MockBean
DictionaryService dictionaryService;
@MockBean
RabbitTemplate rabbitTemplate;
TestDossierTemplate testDossierTemplate;
@MockBean
protected LegalBasisClient legalBasisClient;
@MockBean
private TenantsClient tenantsClient;
@MockBean
protected RulesClient rulesClient;
@MockBean
protected DictionaryClient dictionaryClient;
@Test
@SneakyThrows
public void runAnalysisEnd2End() {
String folder = "files/end2end/file0"; // Should contain all files from minio directly, still zipped. Can contain multiple fileIds.
Path absoluteFolderPath;
if (folder.startsWith("files")) { // if it starts with "files" it is most likely in the resources folder, else it should be an absolute path
ClassPathResource classPathResource = new ClassPathResource(folder);
absoluteFolderPath = classPathResource.getFile().toPath();
} else {
absoluteFolderPath = Path.of(folder);
}
log.info("Starting end2end analyses for all distinct filenames in folder: {}", folder);
List<AnalyzeRequest> analyzeRequests = prepareStorageForFolder(absoluteFolderPath);
log.info("Found {} distinct fileIds", analyzeRequests.size());
for (int i = 0; i < analyzeRequests.size(); i++) {
AnalyzeRequest analyzeRequest = analyzeRequests.get(i);
log.info("{}/{}: Starting analysis for file {}", i + 1, analyzeRequests.size(), analyzeRequest.getFileId());
analyzeService.analyze(analyzeRequest);
}
}
@BeforeEach
public void setup() {
testDossierTemplate = new TestDossierTemplate(dossierTemplateToUse);
when(dictionaryService.updateDictionary(any(), any())).thenReturn(new DictionaryVersion(0, 0));
when(dictionaryService.getDeepCopyDictionary(any(), any())).thenReturn(testDossierTemplate.testDictionary);
when(dictionaryService.getDictionaryIncrements(any(), any(), any())).thenReturn(new DictionaryIncrement(Collections.emptySet(), new DictionaryVersion(0, 0)));
when(dictionaryService.isHint(any(String.class), any())).thenAnswer(invocation -> {
String type = invocation.getArgument(0);
return testDossierTemplate.testDictionary.getType(type).isHint();
});
when(dictionaryService.getColor(any(String.class), any())).thenAnswer(invocation -> {
String type = invocation.getArgument(0);
return testDossierTemplate.testDictionary.getType(type).getColor();
});
when(dictionaryService.getNotRedactedColor(any())).thenReturn(new float[]{0.2f, 0.2f, 0.2f});
when(rulesClient.getVersion(testDossierTemplate.id, RuleFileType.ENTITY)).thenReturn(System.currentTimeMillis());
when(rulesClient.getRules(testDossierTemplate.id, RuleFileType.ENTITY)).thenReturn(JSONPrimitive.of(testDossierTemplate.rules));
when(rulesClient.getVersion(testDossierTemplate.id, RuleFileType.COMPONENT)).thenReturn(testDossierTemplate.componentRules != null ? System.currentTimeMillis() : -1);
when(rulesClient.getRules(testDossierTemplate.id, RuleFileType.COMPONENT)).thenReturn(JSONPrimitive.of(testDossierTemplate.componentRules));
}
@SneakyThrows
private List<AnalyzeRequest> prepareStorageForFolder(Path folder) {
return Files.list(folder)
.map(this::parseFileId)
.distinct()
.map(fileId -> prepareStorageForFile(fileId, folder))
.toList();
}
private String parseFileId(Path path) {
return path.getFileName().toString().split("\\.")[0];
}
@SneakyThrows
private AnalyzeRequest prepareStorageForFile(String fileId, Path folder) {
AnalyzeRequest request = new AnalyzeRequest();
request.setDossierId(UUID.randomUUID().toString());
request.setFileId(UUID.randomUUID().toString());
request.setDossierTemplateId(testDossierTemplate.id);
request.setManualRedactions(new ManualRedactions());
request.setAnalysisNumber(-1);
Set<FileType> endingsToUpload = Set.of("ORIGIN",
"DOCUMENT_PAGES",
"DOCUMENT_POSITION",
"DOCUMENT_STRUCTURE",
"DOCUMENT_TEXT",
"IMAGE_INFO",
"NER_ENTITIES",
"TABLES",
"IMPORTED_REDACTIONS")
.stream()
.map(FileType::valueOf)
.collect(Collectors.toSet());
Set<FileType> uploadedFileTypes = Files.walk(folder)
.filter(path -> path.toFile().isFile())
.filter(path -> endingsToUpload.contains(parseFileTypeFromPath(path)))
.map(filePath -> uploadFile(filePath, request))
.collect(Collectors.toUnmodifiableSet());
Set<FileType> missingFileTypes = Sets.difference(endingsToUpload, uploadedFileTypes);
if (!missingFileTypes.isEmpty()) {
log.error("Folder {} is missing files of type {}",
folder.toFile(),
missingFileTypes.stream()
.map(Enum::toString)
.collect(Collectors.joining(", ")));
throw new NotFoundException("Not all required file types are present.");
}
return request;
}
private static FileType parseFileTypeFromPath(Path path) {
return FileType.valueOf(path.getFileName().toString().split("\\.")[1]);
}
@SneakyThrows
private FileType uploadFile(Path path, AnalyzeRequest request) {
FileType fileType = parseFileTypeFromPath(path);
try (var fis = new FileInputStream(path.toFile()); var in = new GZIPInputStream(fis)) {
storageService.storeObject(TENANT_ID, RedactionStorageService.StorageIdUtils.getStorageId(request.getDossierId(), request.getFileId(), fileType), in);
}
return fileType;
}
private class TestDossierTemplate {
String id;
Dictionary testDictionary;
AtomicInteger dictEntryIdCounter = new AtomicInteger(0);
String rules;
String componentRules;
@SneakyThrows
TestDossierTemplate(Path dossierTemplateToUse) {
Map<String, Object> dossierTemplate = mapper.readValue(dossierTemplateToUse.resolve("dossierTemplate.json").toFile(), HashMap.class);
this.id = (String) dossierTemplate.get("dossierTemplateId");
List<DictionaryModel> dictionaries = Files.walk(dossierTemplateToUse)
.filter(path -> path.getFileName().toString().equals("dossierType.json"))
.map(this::loadDictionaryModel)
.toList();
File ruleFile = dossierTemplateToUse.resolve("rules.drl").toFile();
rules = new String(Files.readAllBytes(ruleFile.toPath()));
File componentRuleFile = dossierTemplateToUse.resolve("componentRules.drl").toFile();
if (componentRuleFile.exists()) {
componentRules = new String(Files.readAllBytes(componentRuleFile.toPath()));
}
testDictionary = new Dictionary(dictionaries, new DictionaryVersion(0, 0));
}
@SneakyThrows
private DictionaryModel loadDictionaryModel(Path path) {
Map<String, Object> model = mapper.readValue(path.toFile(), HashMap.class);
Set<DictionaryEntryModel> entries = new HashSet<>();
Set<DictionaryEntryModel> falsePositives = new HashSet<>();
Set<DictionaryEntryModel> falseRecommendations = new HashSet<>();
String type = (String) model.get("type");
Integer rank = (Integer) model.get("rank");
float[] color = hexToFloatArr((String) model.get("hexColor"));
Boolean caseInsensitive = (Boolean) model.get("caseInsensitive");
Boolean hint = (Boolean) model.get("hint");
Boolean hasDictionary = (Boolean) model.get("hasDictionary");
boolean isDossierDictionary;
if (model.containsKey("dossierDictionaryOnly")) {
isDossierDictionary = true;
} else {
isDossierDictionary = ((String) model.get("id")).split(":").length == 3;
}
if (hasDictionary) {
try (var in = new FileInputStream(path.getParent().resolve("entries.txt").toFile())) {
entries.addAll(parseDictionaryEntryModelFromFile(new String(in.readAllBytes()), dictEntryIdCounter, (String) model.get("typeId")));
}
try (var in = new FileInputStream(path.getParent().resolve("falsePositives.txt").toFile())) {
falsePositives.addAll(parseDictionaryEntryModelFromFile(new String(in.readAllBytes()), dictEntryIdCounter, (String) model.get("typeId")));
}
try (var in = new FileInputStream(path.getParent().resolve("falseRecommendations.txt").toFile())) {
falseRecommendations.addAll(parseDictionaryEntryModelFromFile(new String(in.readAllBytes()), dictEntryIdCounter, (String) model.get("typeId")));
}
}
return new DictionaryModel(type, rank, color, caseInsensitive, hint, entries, falsePositives, falseRecommendations, isDossierDictionary);
}
private Set<DictionaryEntryModel> parseDictionaryEntryModelFromFile(String s, AtomicInteger dictEntryIdCounter, String typeId) {
String[] values = s.split("\n");
return Arrays.stream(values)
.map(value -> new DictionaryEntryModel(dictEntryIdCounter.getAndIncrement(), value, 0L, false, typeId))
.collect(Collectors.toUnmodifiableSet());
}
private float[] hexToFloatArr(String hexColor) {
// Remove # symbol if present
String cleanHexColor = hexColor.replace("#", "");
// Parse hex string into RGB components
int r = Integer.parseInt(cleanHexColor.substring(0, 2), 16);
int g = Integer.parseInt(cleanHexColor.substring(2, 4), 16);
int b = Integer.parseInt(cleanHexColor.substring(4, 6), 16);
// Normalize RGB values to floats between 0 and 1
float[] rgbFloat = new float[3];
rgbFloat[0] = r / 255.0f;
rgbFloat[1] = g / 255.0f;
rgbFloat[2] = b / 255.0f;
return rgbFloat;
}
}
}
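The `hexToFloatArr` helper above normalizes a `#RRGGBB` string into the `[0, 1]` float components the redaction service uses for colors. A minimal standalone sketch of the same conversion (the class name `HexColor` is hypothetical, for illustration only; the arithmetic mirrors the helper):

```java
// Sketch of the hex-to-float color conversion used by hexToFloatArr above.
// HexColor is a hypothetical name; only the arithmetic mirrors the test helper.
final class HexColor {

    static float[] hexToFloatArr(String hexColor) {
        String clean = hexColor.replace("#", ""); // tolerate a leading '#'
        float[] rgb = new float[3];
        for (int i = 0; i < 3; i++) {
            // each pair of hex digits is one channel, normalized to [0, 1]
            rgb[i] = Integer.parseInt(clean.substring(i * 2, i * 2 + 2), 16) / 255.0f;
        }
        return rgb;
    }

    public static void main(String[] args) {
        float[] c = hexToFloatArr("#ffe187"); // the color mocked in the tests above
        System.out.printf(java.util.Locale.ROOT, "%.3f %.3f %.3f%n", c[0], c[1], c[2]); // prints "1.000 0.882 0.529"
    }
}
```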


@@ -31,6 +31,7 @@ import org.springframework.test.context.junit.jupiter.SpringExtension;
 import com.fasterxml.jackson.databind.ObjectMapper;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntityLog;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntityLogEntry;
+import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Position;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.migration.MigratedIds;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.ManualRedactions;
@@ -49,7 +50,6 @@ import com.iqser.red.service.redaction.v1.server.model.document.nodes.Document;
 import com.iqser.red.service.redaction.v1.server.redaction.utils.OsUtils;
 import com.iqser.red.service.redaction.v1.server.service.DictionaryService;
 import com.iqser.red.service.redaction.v1.server.service.document.EntityFindingUtility;
-import com.iqser.red.service.redaction.v1.server.utils.RectangleTransformations;
 import com.knecon.fforesight.tenantcommons.TenantContext;
 import lombok.SneakyThrows;
@@ -107,7 +107,7 @@ public class MigrationIntegrationTest extends BuildDocumentIntegrationTest {
 @SneakyThrows
 public void testSave() {
-    MigratedIds ids = new MigratedIds(new LinkedList<>());
+    MigratedIds ids = new MigratedIds(new LinkedList<>(), null, null);
     ids.addMapping("123", "321");
     ids.addMapping("123", "321");
     ids.addMapping("123", "321");
@@ -173,7 +173,13 @@ public class MigrationIntegrationTest extends BuildDocumentIntegrationTest {
     mergedRedactionLog = redactionLog;
 }
-MigratedEntityLog migratedEntityLog = redactionLogToEntityLogMigrationService.migrate(mergedRedactionLog, document, TEST_DOSSIER_TEMPLATE_ID, manualRedactions);
+MigratedEntityLog migratedEntityLog = redactionLogToEntityLogMigrationService.migrate(mergedRedactionLog,
+        document,
+        TEST_DOSSIER_TEMPLATE_ID,
+        manualRedactions,
+        TEST_FILE_ID,
+        Collections.emptySet(),
+        false);
 redactionStorageService.storeObject(TEST_DOSSIER_ID, TEST_FILE_ID, FileType.ENTITY_LOG, migratedEntityLog.getEntityLog());
 assertEquals(mergedRedactionLog.getRedactionLogEntry().size(), migratedEntityLog.getEntityLog().getEntityLogEntry().size());
@@ -187,10 +193,11 @@ public class MigrationIntegrationTest extends BuildDocumentIntegrationTest {
 assertEquals(mergedRedactionLog.getLegalBasis().size(), entityLog.getLegalBasis().size());
 Map<String, String> migratedIds = migratedEntityLog.getMigratedIds().buildOldToNewMapping();
+// assertEquals(legacyRedactionLogMergeService.getNumberOfAffectedAnnotations(manualRedactions), migratedIds.size());
 migratedIds.forEach((oldId, newId) -> assertEntryIsEqual(oldId, newId, mergedRedactionLog, entityLog, migratedIds));
-AnnotateResponse annotateResponse = annotationService.annotate(AnnotateRequest.builder().dossierId(TEST_DOSSIER_ID).fileId(TEST_FILE_ID)
-        .build());
+AnnotateResponse annotateResponse = annotationService.annotate(AnnotateRequest.builder().dossierId(TEST_DOSSIER_ID).fileId(TEST_FILE_ID).build());
 File outputFile = Path.of(OsUtils.getTemporaryDirectory()).resolve(Path.of(fileName.replaceAll(".pdf", "_MIGRATED.pdf")).getFileName()).toFile();
 try (FileOutputStream fileOutputStream = new FileOutputStream(outputFile)) {
@@ -268,13 +275,24 @@ public class MigrationIntegrationTest extends BuildDocumentIntegrationTest {
 if (!redactionLogEntry.isImage()) {
     assertEquals(redactionLogEntry.getValue().toLowerCase(Locale.ENGLISH), entityLogEntry.getValue().toLowerCase(Locale.ENGLISH));
 }
+if (entityLogEntry.getManualChanges()
+        .stream()
+        .noneMatch(mc -> mc.getManualRedactionType().equals(ManualRedactionType.RECATEGORIZE))) {
+    assertEquals(redactionLogEntry.getType(), entityLogEntry.getType());
+}
 assertEquals(redactionLogEntry.getChanges().size(), entityLogEntry.getChanges().size());
 assertTrue(redactionLogEntry.getManualChanges().size() <= entityLogEntry.getManualChanges().size());
 assertEquals(redactionLogEntry.getPositions().size(), entityLogEntry.getPositions().size());
-assertTrue(positionsAlmostEqual(redactionLogEntry.getPositions(), entityLogEntry.getPositions()));
-// assertEquals(redactionLogEntry.getColor(), entityLogEntry.getColor());
-assertEqualsNullSafe(redactionLogEntry.getLegalBasis(), entityLogEntry.getLegalBasis());
-// assertEqualsNullSafe(redactionLogEntry.getReason(), entityLogEntry.getReason());
+if (entityLogEntry.getManualChanges()
+        .stream()
+        .noneMatch(mc -> mc.getManualRedactionType().equals(ManualRedactionType.RESIZE) || mc.getManualRedactionType().equals(ManualRedactionType.RESIZE_IN_DICTIONARY))) {
+    assertTrue(positionsAlmostEqual(redactionLogEntry.getPositions(), entityLogEntry.getPositions()));
+}
+if (entityLogEntry.getManualChanges()
+        .stream()
+        .noneMatch(mc -> mc.getManualRedactionType().equals(ManualRedactionType.FORCE))) {
+    assertEqualsNullSafe(redactionLogEntry.getLegalBasis(), entityLogEntry.getLegalBasis());
+}
 assertReferencesEqual(redactionLogEntry.getReference(), entityLogEntry.getReference(), oldToNewMapping);
 assertEquals(redactionLogEntry.isDictionaryEntry(), entityLogEntry.isDictionaryEntry());
 assertEquals(redactionLogEntry.isDossierDictionaryEntry(), entityLogEntry.isDossierDictionaryEntry());


@@ -87,15 +87,15 @@ public class RedactionAcceptanceTest extends AbstractRedactionIntegrationTest {
 when(dictionaryClient.getVersion(TEST_DOSSIER_TEMPLATE_ID)).thenReturn(0L);
 when(dictionaryClient.getAllTypesForDossier(TEST_DOSSIER_ID, true)).thenReturn(List.of(Type.builder()
         .id(DOSSIER_REDACTIONS_INDICATOR + ":" + TEST_DOSSIER_TEMPLATE_ID)
         .type(DOSSIER_REDACTIONS_INDICATOR)
         .dossierTemplateId(TEST_DOSSIER_ID)
         .hexColor("#ffe187")
         .isHint(hintTypeMap.get(DOSSIER_REDACTIONS_INDICATOR))
         .isCaseInsensitive(caseInSensitiveMap.get(DOSSIER_REDACTIONS_INDICATOR))
         .isRecommendation(recommendationTypeMap.get(DOSSIER_REDACTIONS_INDICATOR))
         .rank(rankTypeMap.get(DOSSIER_REDACTIONS_INDICATOR))
         .build()));
 mockDictionaryCalls(null);
@@ -122,9 +122,12 @@ public class RedactionAcceptanceTest extends AbstractRedactionIntegrationTest {
     assertThat(recommendations).containsExactlyInAnyOrder("Michael N.", "Funnarie B.", "Feuer A.");
 }
 @Test
 public void acceptanceTests() throws IOException {
+    String EFSA_SANITISATION_RULES = loadFromClassPath("drools/efsa_sanitisation.drl");
+    when(rulesClient.getRules(TEST_DOSSIER_TEMPLATE_ID, RuleFileType.ENTITY)).thenReturn(JSONPrimitive.of(EFSA_SANITISATION_RULES));
     AnalyzeRequest request = uploadFileToStorage("files/new/SYNGENTA_EFSA_sanitisation_GFL_v1_moreSections.pdf");
     System.out.println("Start Full integration test");
     analyzeDocumentStructure(LayoutParsingType.REDACT_MANAGER, request);
@@ -133,8 +136,12 @@ public class RedactionAcceptanceTest extends AbstractRedactionIntegrationTest {
     System.out.println("Finished analysis");
     EntityLog entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
-    var publishedInformationEntry1 = findEntityByTypeAndValue(entityLog, "published_information", "Oxford University Press").findFirst().orElseThrow();
-    var asyaLyon1 = findEntityByTypeAndValueAndSectionNumber(entityLog, "CBI_author", "Asya Lyon", publishedInformationEntry1.getContainingNodeId()).findFirst().orElseThrow();
+    var publishedInformationEntry1 = findEntityByTypeAndValue(entityLog, "published_information", "Oxford University Press").findFirst()
+            .orElseThrow();
+    assertThat(publishedInformationEntry1.getSection().startsWith("Paragraph:"));
+    var asyaLyon1 = findEntityByTypeAndValueAndSectionNumber(entityLog, "CBI_author", "Asya Lyon", publishedInformationEntry1.getContainingNodeId()).findFirst()
+            .orElseThrow();
+    assertThat(publishedInformationEntry1.getSection().startsWith("Paragraph:"));
     assertEquals(EntryState.SKIPPED, asyaLyon1.getState());
@@ -146,8 +153,10 @@ public class RedactionAcceptanceTest extends AbstractRedactionIntegrationTest {
     entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
-    var publishedInformationEntry2 = findEntityByTypeAndValue(entityLog, "published_information", "Oxford University Press").findFirst().orElseThrow();
-    var asyaLyon2 = findEntityByTypeAndValueAndSectionNumber(entityLog, "CBI_author", "Asya Lyon", publishedInformationEntry2.getContainingNodeId()).findFirst().orElseThrow();
+    var publishedInformationEntry2 = findEntityByTypeAndValue(entityLog, "published_information", "Oxford University Press").findFirst()
+            .orElseThrow();
+    var asyaLyon2 = findEntityByTypeAndValueAndSectionNumber(entityLog, "CBI_author", "Asya Lyon", publishedInformationEntry2.getContainingNodeId()).findFirst()
+            .orElseThrow();
     assertEquals(EntryState.APPLIED, asyaLyon2.getState());
@@ -168,19 +177,25 @@ public class RedactionAcceptanceTest extends AbstractRedactionIntegrationTest {
         .stream()
         .filter(entry -> entry.getType().equals(type))
         .filter(entry -> entry.getValue().equals(value))
-        .filter(entry -> entry.getContainingNodeId().get(0).equals(sectionNumber.get(0)));
+        .filter(entry -> entry.getContainingNodeId()
+                .get(0).equals(sectionNumber.get(0)));
 }
 private static Stream<EntityLogEntry> findEntityByTypeAndValue(EntityLog redactionLog, String type, String value) {
-    return redactionLog.getEntityLogEntry().stream().filter(entry -> entry.getType().equals(type)).filter(entry -> entry.getValue().equals(value));
+    return redactionLog.getEntityLogEntry()
+            .stream()
+            .filter(entry -> entry.getType().equals(type))
+            .filter(entry -> entry.getValue().equals(value));
 }
 @Test
 public void noEndlessLoopsTest() {
+    String EFSA_SANITISATION_RULES = loadFromClassPath("drools/efsa_sanitisation.drl");
+    when(rulesClient.getRules(TEST_DOSSIER_TEMPLATE_ID, RuleFileType.ENTITY)).thenReturn(JSONPrimitive.of(EFSA_SANITISATION_RULES));
     AnalyzeRequest request = uploadFileToStorage("files/new/SYNGENTA_EFSA_sanitisation_GFL_v1_moreSections.pdf");
     System.out.println("Start Full integration test");
     analyzeDocumentStructure(LayoutParsingType.REDACT_MANAGER, request);
@@ -201,13 +216,15 @@ public class RedactionAcceptanceTest extends AbstractRedactionIntegrationTest {
     var redactionLog2 = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
     assertEquals(EntryState.IGNORED,
-            findEntityByTypeAndValue(redactionLog2, "CBI_author", "Desiree").filter(entry -> entry.getEntryType().equals(EntryType.ENTITY)).findFirst().get().getState());
+            findEntityByTypeAndValue(redactionLog2, "CBI_author", "Desiree").filter(entry -> entry.getEntryType().equals(EntryType.ENTITY))
+                    .findFirst()
+                    .get().getState());
 }
 private static IdRemoval buildIdRemoval(String id) {
-    return IdRemoval.builder().annotationId(id).requestDate(OffsetDateTime.now()).fileId(TEST_FILE_ID).build();
+    return IdRemoval.builder().annotationId(id).user("user").requestDate(OffsetDateTime.now()).fileId(TEST_FILE_ID).build();
 }
 }


@ -119,15 +119,15 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
when(dictionaryClient.getVersion(TEST_DOSSIER_TEMPLATE_ID)).thenReturn(0L); when(dictionaryClient.getVersion(TEST_DOSSIER_TEMPLATE_ID)).thenReturn(0L);
when(dictionaryClient.getAllTypesForDossier(TEST_DOSSIER_ID, true)).thenReturn(List.of(Type.builder() when(dictionaryClient.getAllTypesForDossier(TEST_DOSSIER_ID, true)).thenReturn(List.of(Type.builder()
.id(DOSSIER_REDACTIONS_INDICATOR + ":" + TEST_DOSSIER_TEMPLATE_ID) .id(DOSSIER_REDACTIONS_INDICATOR + ":" + TEST_DOSSIER_TEMPLATE_ID)
.type(DOSSIER_REDACTIONS_INDICATOR) .type(DOSSIER_REDACTIONS_INDICATOR)
.dossierTemplateId(TEST_DOSSIER_ID) .dossierTemplateId(TEST_DOSSIER_ID)
.hexColor("#ffe187") .hexColor("#ffe187")
.isHint(hintTypeMap.get(DOSSIER_REDACTIONS_INDICATOR)) .isHint(hintTypeMap.get(DOSSIER_REDACTIONS_INDICATOR))
.isCaseInsensitive(caseInSensitiveMap.get(DOSSIER_REDACTIONS_INDICATOR)) .isCaseInsensitive(caseInSensitiveMap.get(DOSSIER_REDACTIONS_INDICATOR))
.isRecommendation(recommendationTypeMap.get(DOSSIER_REDACTIONS_INDICATOR)) .isRecommendation(recommendationTypeMap.get(DOSSIER_REDACTIONS_INDICATOR))
.rank(rankTypeMap.get(DOSSIER_REDACTIONS_INDICATOR)) .rank(rankTypeMap.get(DOSSIER_REDACTIONS_INDICATOR))
.build())); .build()));
mockDictionaryCalls(null); mockDictionaryCalls(null);
@ -169,9 +169,10 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
var entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID); var entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
entityLog.getEntityLogEntry().forEach(entry -> { entityLog.getEntityLogEntry()
duplicates.computeIfAbsent(entry.getId(), v -> new ArrayList<>()).add(entry); .forEach(entry -> {
}); duplicates.computeIfAbsent(entry.getId(), v -> new ArrayList<>()).add(entry);
});
duplicates.forEach((key, value) -> assertThat(value.size()).isEqualTo(1)); duplicates.forEach((key, value) -> assertThat(value.size()).isEqualTo(1));
@ -216,12 +217,14 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
ManualRedactions manualRedactions = ManualRedactions.builder() ManualRedactions manualRedactions = ManualRedactions.builder()
.resizeRedactions(Set.of(ManualResizeRedaction.builder() .resizeRedactions(Set.of(ManualResizeRedaction.builder()
.annotationId("c6be5277f5ee60dc3d83527798b7fe02") .annotationId("c6be5277f5ee60dc3d83527798b7fe02")
.value("Dr. Alan") .fileId(TEST_FILE_ID)
.positions(List.of(new Rectangle(236.8f, 182.90005f, 40.584f, 12.642f, 7))) .value("Dr. Alan")
.requestDate(OffsetDateTime.now()) .positions(List.of(new Rectangle(236.8f, 182.90005f, 40.584f, 12.642f, 7)))
.updateDictionary(false) .requestDate(OffsetDateTime.now())
.build())) .updateDictionary(false)
.user("user")
.build()))
.build(); .build();
request.setManualRedactions(manualRedactions); request.setManualRedactions(manualRedactions);
@ -256,7 +259,10 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
var redactionLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID); var redactionLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
var values = redactionLog.getEntityLogEntry().stream().map(EntityLogEntry::getValue).collect(Collectors.toList()); var values = redactionLog.getEntityLogEntry()
.stream()
.map(EntityLogEntry::getValue)
.collect(Collectors.toList());
assertThat(values).containsExactlyInAnyOrder("Lastname M.", "Doe", "Doe J.", "M. Mustermann", "Mustermann M.", "F. Lastname"); assertThat(values).containsExactlyInAnyOrder("Lastname M.", "Doe", "Doe J.", "M. Mustermann", "Mustermann M.", "F. Lastname");
} }
@ -268,8 +274,8 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
ClassPathResource importedRedactionClasspathResource = new ClassPathResource( ClassPathResource importedRedactionClasspathResource = new ClassPathResource(
"files/ImportedRedactions/18 Chlorothalonil RAR 08 Volume 3CA B 6a Oct 2017.IMPORTED_REDACTIONS.json"); "files/ImportedRedactions/18 Chlorothalonil RAR 08 Volume 3CA B 6a Oct 2017.IMPORTED_REDACTIONS.json");
storageService.storeObject(TenantContext.getTenantId(), storageService.storeObject(TenantContext.getTenantId(),
RedactionStorageService.StorageIdUtils.getStorageId(TEST_DOSSIER_ID, TEST_FILE_ID, FileType.IMPORTED_REDACTIONS), RedactionStorageService.StorageIdUtils.getStorageId(TEST_DOSSIER_ID, TEST_FILE_ID, FileType.IMPORTED_REDACTIONS),
importedRedactionClasspathResource.getInputStream()); importedRedactionClasspathResource.getInputStream());
AnalyzeRequest request = uploadFileToStorage("files/ImportedRedactions/18 Chlorothalonil RAR 08 Volume 3CA B 6a Oct 2017.pdf"); AnalyzeRequest request = uploadFileToStorage("files/ImportedRedactions/18 Chlorothalonil RAR 08 Volume 3CA B 6a Oct 2017.pdf");
System.out.println("Start Full integration test"); System.out.println("Start Full integration test");
@ -353,10 +359,18 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
var mergedEntityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID); var mergedEntityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
var cbiAddressBeforeHintRemoval = entityLog.getEntityLogEntry().stream().filter(re -> re.getType().equalsIgnoreCase("CBI_Address")).findAny().get(); var cbiAddressBeforeHintRemoval = entityLog.getEntityLogEntry()
.stream()
.filter(re -> re.getType().equalsIgnoreCase("CBI_Address"))
.findAny()
.get();
assertThat(cbiAddressBeforeHintRemoval.getState().equals(EntryState.APPLIED)).isFalse(); assertThat(cbiAddressBeforeHintRemoval.getState().equals(EntryState.APPLIED)).isFalse();
var cbiAddressAfterHintRemoval = mergedEntityLog.getEntityLogEntry().stream().filter(re -> re.getType().equalsIgnoreCase("CBI_Address")).findAny().get(); var cbiAddressAfterHintRemoval = mergedEntityLog.getEntityLogEntry()
.stream()
.filter(re -> re.getType().equalsIgnoreCase("CBI_Address"))
.findAny()
.get();
assertThat(cbiAddressAfterHintRemoval.getState().equals(EntryState.APPLIED)).isTrue(); assertThat(cbiAddressAfterHintRemoval.getState().equals(EntryState.APPLIED)).isTrue();
} }
@ -386,9 +400,10 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
var entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID); var entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
entityLog.getEntityLogEntry().forEach(entry -> { entityLog.getEntityLogEntry()
duplicates.computeIfAbsent(entry.getId(), v -> new ArrayList<>()).add(entry); .forEach(entry -> {
}); duplicates.computeIfAbsent(entry.getId(), v -> new ArrayList<>()).add(entry);
});
duplicates.forEach((id, redactionLogEntries) -> assertThat(redactionLogEntries.size()).isEqualTo(1)); duplicates.forEach((id, redactionLogEntries) -> assertThat(redactionLogEntries.size()).isEqualTo(1));
@ -421,11 +436,11 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
AnalyzeRequest request = uploadFileToStorage(fileName); AnalyzeRequest request = uploadFileToStorage(fileName);
request.setFileAttributes(List.of(FileAttribute.builder() request.setFileAttributes(List.of(FileAttribute.builder()
.id("fileAttributeId") .id("fileAttributeId")
.label("Vertebrate Study") .label("Vertebrate Study")
.placeholder("{fileattributes.vertebrateStudy}") .placeholder("{fileattributes.vertebrateStudy}")
.value("true") .value("true")
.build())); .build()));
analyzeDocumentStructure(LayoutParsingType.REDACT_MANAGER, request); analyzeDocumentStructure(LayoutParsingType.REDACT_MANAGER, request);
AnalyzeResult result = analyzeService.analyze(request); AnalyzeResult result = analyzeService.analyze(request);
@@ -449,7 +464,10 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 correctFound++;
 continue loop;
 }
-if (Objects.equals(entityLogEntry.getContainingNodeId().get(0), section.getTreeId().get(0))) {
+if (Objects.equals(entityLogEntry.getContainingNodeId()
+        .get(0),
+        section.getTreeId()
+        .get(0))) {
 String value = section.getTextBlock().subSequence(new TextRange(entityLogEntry.getStartOffset(), entityLogEntry.getEndOffset())).toString();
 if (entityLogEntry.getValue().equalsIgnoreCase(value)) {
 correctFound++;
@@ -481,12 +499,12 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 ManualRedactions manualRedactions = new ManualRedactions();
 manualRedactions.setEntriesToAdd(Set.of(ManualRedactionEntry.builder()
         .value("Redact")
         .addToDictionary(true)
         .addToDossierDictionary(true)
         .positions(List.of(new Rectangle(new Point(95.96979999999999f, 515.7984f), 19.866899999999987f, 46.953f, 2)))
         .type("dossier_redaction")
         .build()));
 request.setManualRedactions(manualRedactions);
@@ -548,7 +566,11 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 var entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
-var changes = entityLog.getEntityLogEntry().stream().filter(entry -> entry.getValue() != null && entry.getValue().equals("report")).findFirst().get().getChanges();
+var changes = entityLog.getEntityLogEntry()
+        .stream()
+        .filter(entry -> entry.getValue() != null && entry.getValue().equals("report"))
+        .findFirst()
+        .get().getChanges();
 assertThat(changes.size()).isEqualTo(2);
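The reformatted lookup above uses .findFirst().get(), which throws NoSuchElementException with no context when nothing matches. A minimal sketch of the same filter/findFirst pattern with the safer orElseThrow variant; Entry is a hypothetical stand-in for EntityLogEntry, not the service's actual type:

```java
import java.util.List;
import java.util.Optional;

public class FirstMatch {
    // hypothetical stand-in for EntityLogEntry
    record Entry(String value, int changes) {}

    static Entry firstWithValue(List<Entry> entries, String wanted) {
        // same null-guarded filter as the test, then the first match
        Optional<Entry> match = entries.stream()
                .filter(e -> e.value() != null && e.value().equals(wanted))
                .findFirst();
        // orElseThrow gives a descriptive failure instead of a bare get()
        return match.orElseThrow(() -> new IllegalStateException("no entry with value " + wanted));
    }

    public static void main(String[] args) {
        List<Entry> log = List.of(new Entry("summary", 1), new Entry("report", 2));
        System.out.println(firstWithValue(log, "report").changes()); // 2
    }
}
```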
@@ -563,23 +585,25 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 @Test
 public void redactionTest() throws IOException {
+    String EFSA_SANITISATION_RULES = loadFromClassPath("drools/efsa_sanitisation.drl");
+    when(rulesClient.getRules(TEST_DOSSIER_TEMPLATE_ID, RuleFileType.ENTITY)).thenReturn(JSONPrimitive.of(EFSA_SANITISATION_RULES));
 String fileName = "files/new/crafted document.pdf";
 String outputFileName = OsUtils.getTemporaryDirectory() + "/Annotated.pdf";
 ClassPathResource responseJson = new ClassPathResource("files/crafted_document.NER_ENTITIES.json");
 storageService.storeObject(TenantContext.getTenantId(),
         RedactionStorageService.StorageIdUtils.getStorageId(TEST_DOSSIER_ID, TEST_FILE_ID, FileType.NER_ENTITIES),
         responseJson.getInputStream());
 long start = System.currentTimeMillis();
 AnalyzeRequest request = uploadFileToStorage(fileName);
 request.setFileAttributes(List.of(FileAttribute.builder()
         .id("fileAttributeId")
         .label("Vertebrate Study")
         .placeholder("{fileattributes.vertebrateStudy}")
         .value("true")
         .build()));
 analyzeDocumentStructure(LayoutParsingType.REDACT_MANAGER, request);
 AnalyzeResult result = analyzeService.analyze(request);
@@ -601,7 +625,11 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
         .map(redactionLogEntry -> new TextRange(redactionLogEntry.getStartOffset(), redactionLogEntry.getEndOffset()))
         .map(boundary -> documentGraph.getTextBlock().subSequence(boundary).toString())
         .toList();
-List<String> valuesInRedactionLog = entityLog.getEntityLogEntry().stream().filter(e -> !e.getEntryType().equals(EntryType.IMAGE)).map(EntityLogEntry::getValue).toList();
+List<String> valuesInRedactionLog = entityLog.getEntityLogEntry()
+        .stream()
+        .filter(e -> !e.getEntryType().equals(EntryType.IMAGE))
+        .map(EntityLogEntry::getValue)
+        .toList();
 assertEquals(valuesInRedactionLog, valuesInDocument);
@@ -628,11 +656,12 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 ManualRedactions manualRedactions = new ManualRedactions();
 manualRedactions.setRecategorizations(Set.of(ManualRecategorization.builder()
         .annotationId("37eee3e9d589a5cc529bfec38c3ba479")
         .fileId("fileId")
         .type("signature")
         .requestDate(OffsetDateTime.now())
+        .user("user")
         .build()));
 request.setManualRedactions(manualRedactions);
@@ -683,40 +712,43 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 ManualRedactions manualRedactions = new ManualRedactions();
 manualRedactions.getIdsToRemove()
         .add(IdRemoval.builder()
                 .annotationId("308dab9015bfafd911568cffe0a7f7de")
                 .fileId(TEST_FILE_ID)
+                .user("user")
                 .requestDate(OffsetDateTime.of(2022, 05, 23, 8, 30, 07, 475479, ZoneOffset.UTC))
                 .processedDate(OffsetDateTime.of(2022, 05, 23, 8, 30, 07, 483651, ZoneOffset.UTC))
                 .build());
 manualRedactions.getForceRedactions()
         .add(ManualForceRedaction.builder()
                 .annotationId("0b56ea1a87c83f351df177315af94f0d")
                 .fileId(TEST_FILE_ID)
                 .legalBasis("Something")
+                .user("user")
                 .requestDate(OffsetDateTime.of(2022, 05, 23, 9, 30, 15, 4653, ZoneOffset.UTC))
                 .processedDate(OffsetDateTime.of(2022, 05, 23, 9, 30, 15, 794, ZoneOffset.UTC))
                 .build());
 manualRedactions.getIdsToRemove()
         .add(IdRemoval.builder()
                 .annotationId("0b56ea1a87c83f351df177315af94f0d")
                 .fileId(TEST_FILE_ID)
+                .user("user")
                 .requestDate(OffsetDateTime.of(2022, 05, 23, 8, 30, 23, 961721, ZoneOffset.UTC))
                 .processedDate(OffsetDateTime.of(2022, 05, 23, 8, 30, 23, 96528, ZoneOffset.UTC))
                 .build());
 request.setManualRedactions(manualRedactions);
 AnalyzeResult result = analyzeService.analyze(request);
 AnnotateResponse annotateResponse = annotationService.annotate(AnnotateRequest.builder()
         .manualRedactions(manualRedactions)
         .colors(colors)
         .types(types)
         .dossierId(TEST_DOSSIER_ID)
         .fileId(TEST_FILE_ID)
         .build());
 try (FileOutputStream fileOutputStream = new FileOutputStream(OsUtils.getTemporaryDirectory() + "/Annotated.pdf")) {
     fileOutputStream.write(annotateResponse.getDocument());
@@ -921,6 +953,7 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
         .textBefore("")
         .updateDictionary(false)
         .textAfter("")
+        .user("user")
         .build();
 manualRedactions.getResizeRedactions().add(manualResizeRedaction);
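The recurring change across these hunks adds a .user("user") call to each manual-redaction builder. A minimal hand-rolled sketch of such a builder with the new field; IdRemovalSketch is a hypothetical illustration, not the service's (presumably Lombok-generated) IdRemoval:

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

public class IdRemovalSketch {
    private final String annotationId;
    private final String user;
    private final OffsetDateTime requestDate;

    private IdRemovalSketch(Builder b) {
        this.annotationId = b.annotationId;
        this.user = b.user;
        this.requestDate = b.requestDate;
    }

    String annotationId() { return annotationId; }
    String user() { return user; }

    static Builder builder() { return new Builder(); }

    static class Builder {
        private String annotationId;
        private String user;
        private OffsetDateTime requestDate;

        Builder annotationId(String v) { this.annotationId = v; return this; }
        // the field these commits thread through every request builder
        Builder user(String v) { this.user = v; return this; }
        Builder requestDate(OffsetDateTime v) { this.requestDate = v; return this; }
        IdRemovalSketch build() { return new IdRemovalSketch(this); }
    }

    public static void main(String[] args) {
        IdRemovalSketch removal = IdRemovalSketch.builder()
                .annotationId("308dab9015bfafd911568cffe0a7f7de")
                .user("user")
                .requestDate(OffsetDateTime.of(2022, 5, 23, 8, 30, 7, 475479, ZoneOffset.UTC))
                .build();
        System.out.println(removal.user()); // user
    }
}
```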
@@ -932,12 +965,12 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 var entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
 AnnotateResponse annotateResponse = annotationService.annotate(AnnotateRequest.builder()
         .manualRedactions(manualRedactions)
         .colors(colors)
         .types(types)
         .dossierId(TEST_DOSSIER_ID)
         .fileId(TEST_FILE_ID)
         .build());
 try (FileOutputStream fileOutputStream = new FileOutputStream(OsUtils.getTemporaryDirectory() + "/Annotated.pdf")) {
     fileOutputStream.write(annotateResponse.getDocument());
@@ -960,15 +993,16 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 var redactionLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
-redactionLog.getEntityLogEntry().forEach(entry -> {
-    if (!entry.getEntryType().equals(EntryType.HINT)) {
-        if (entry.getType().equals("CBI_author")) {
-            assertThat(entry.getReason()).isEqualTo("Not redacted because it's row does not belong to a vertebrate study");
-        } else if (entry.getType().equals("CBI_address")) {
-            assertThat(entry.getReason()).isEqualTo("No vertebrate found");
-        }
-    }
-});
+redactionLog.getEntityLogEntry()
+        .forEach(entry -> {
+            if (!entry.getEntryType().equals(EntryType.HINT)) {
+                if (entry.getType().equals("CBI_author")) {
+                    assertThat(entry.getReason()).isEqualTo("Not redacted because it's row does not belong to a vertebrate study");
+                } else if (entry.getType().equals("CBI_address")) {
+                    assertThat(entry.getReason()).isEqualTo("No vertebrate found");
+                }
+            }
+        });
 }
@@ -1005,18 +1039,20 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 String manualAddId = UUID.randomUUID().toString();
 manualRedactions.setIdsToRemove(Set.of(IdRemoval.builder()
         .annotationId("5b940b2cb401ed9f5be6fc24f6e77bcf")
         .fileId("fileId")
+        .user("user")
         .processedDate(OffsetDateTime.now())
         .requestDate(OffsetDateTime.now())
         .build()));
 manualRedactions.setForceRedactions(Set.of(ManualForceRedaction.builder()
         .annotationId("675eba69b0c2917de55462c817adaa05")
         .fileId("fileId")
+        .user("user")
         .legalBasis("Something")
         .requestDate(OffsetDateTime.now())
         .processedDate(OffsetDateTime.now())
         .build()));
 ManualRedactionEntry manualRedactionEntry = new ManualRedactionEntry();
 manualRedactionEntry.setAnnotationId(manualAddId);
@@ -1027,7 +1063,7 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 manualRedactionEntry.setProcessedDate(OffsetDateTime.now());
 manualRedactionEntry.setRequestDate(OffsetDateTime.now());
 manualRedactionEntry.setPositions(List.of(Rectangle.builder().topLeftX(375.61096f).topLeftY(241.282f).width(7.648041f).height(43.72262f).page(1).build(),
         Rectangle.builder().topLeftX(384.83517f).topLeftY(241.282f).width(7.648041f).height(17.043358f).page(1).build()));
 // manualRedactions.getEntriesToAdd().add(manualRedactionEntry);
@@ -1038,39 +1074,63 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 manualRedactions.getEntriesToAdd().add(manualRedactionEntry);
 manualRedactions.setIdsToRemove(Set.of(IdRemoval.builder()
         .annotationId("5b940b2cb401ed9f5be6fc24f6e77bcf")
         .fileId("fileId")
         .requestDate(OffsetDateTime.now())
         .processedDate(OffsetDateTime.now())
         .build()));
 manualRedactions.setLegalBasisChanges((Set.of(ManualLegalBasisChange.builder()
         .annotationId("675eba69b0c2917de55462c817adaa05")
         .fileId("fileId")
         .legalBasis("Manual Legal Basis Change")
         .processedDate(OffsetDateTime.now())
         .requestDate(OffsetDateTime.now())
         .build())));
 manualRedactions.setResizeRedactions(Set.of(ManualResizeRedaction.builder()
         .annotationId("fc287b74be2421156ab2895c7474ccdd")
         .fileId("fileId")
         .processedDate(OffsetDateTime.now())
         .requestDate(OffsetDateTime.now())
         .value("Syngenta Crop Protection AG, Basel, Switzerland RCC Ltd., Itingen, Switzerland")
-        .positions(List.of(Rectangle.builder().topLeftX(289.44595f).topLeftY(327.567f).width(7.648041f).height(82.51475f).page(1).build(),
-                Rectangle.builder().topLeftX(298.67056f).topLeftY(327.567f).width(7.648041f).height(75.32377f).page(1).build(),
-                Rectangle.builder().topLeftX(307.89517f).topLeftY(327.567f).width(7.648041f).height(61.670967f).page(1).build(),
-                Rectangle.builder().topLeftX(316.99985f).topLeftY(327.567f).width(7.648041f).height(38.104286f).page(1).build()))
+        .positions(List.of(Rectangle.builder()
+                        .topLeftX(289.44595f)
+                        .topLeftY(327.567f)
+                        .width(7.648041f)
+                        .height(82.51475f)
+                        .page(1)
+                        .build(),
+                Rectangle.builder()
+                        .topLeftX(298.67056f)
+                        .topLeftY(327.567f)
+                        .width(7.648041f)
+                        .height(75.32377f)
+                        .page(1)
+                        .build(),
+                Rectangle.builder()
+                        .topLeftX(307.89517f)
+                        .topLeftY(327.567f)
+                        .width(7.648041f)
+                        .height(61.670967f)
+                        .page(1)
+                        .build(),
+                Rectangle.builder()
+                        .topLeftX(316.99985f)
+                        .topLeftY(327.567f)
+                        .width(7.648041f)
+                        .height(38.104286f)
+                        .page(1)
+                        .build()))
         .updateDictionary(false)
         .build()));
 analyzeService.reanalyze(request);
 AnnotateResponse annotateResponse = annotationService.annotate(AnnotateRequest.builder()
         .dossierId(TEST_DOSSIER_ID)
         .fileId(TEST_FILE_ID)
         .dossierTemplateId(TEST_DOSSIER_TEMPLATE_ID)
         .manualRedactions(manualRedactions)
         .build());
 try (FileOutputStream fileOutputStream = new FileOutputStream(OsUtils.getTemporaryDirectory() + "/Annotated.pdf")) {
     fileOutputStream.write(annotateResponse.getDocument());
@@ -1110,8 +1170,8 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 AnalyzeRequest request = uploadFileToStorage("files/ImportedRedactions/RotateTestFile_without_highlights.pdf");
 storageService.storeObject(TenantContext.getTenantId(),
         RedactionStorageService.StorageIdUtils.getStorageId(TEST_DOSSIER_ID, TEST_FILE_ID, FileType.IMPORTED_REDACTIONS),
         importedRedactions.getInputStream());
 analyzeDocumentStructure(LayoutParsingType.REDACT_MANAGER, request);
 AnalyzeResult result = analyzeService.analyze(request);
@@ -1124,17 +1184,18 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
     fileOutputStream.write(annotateResponse.getDocument());
 }
-entityLog.getEntityLogEntry().forEach(entry -> {
-    if (entry.getValue() == null) {
-        return;
-    }
-    if (entry.getValue().equals("David")) {
-        assertThat(entry.getImportedRedactionIntersections()).hasSize(1);
-    }
-    if (entry.getValue().equals("annotation")) {
-        assertThat(entry.getImportedRedactionIntersections()).isEmpty();
-    }
-});
+entityLog.getEntityLogEntry()
+        .forEach(entry -> {
+            if (entry.getValue() == null) {
+                return;
+            }
+            if (entry.getValue().equals("David")) {
+                assertThat(entry.getImportedRedactionIntersections()).hasSize(1);
+            }
+            if (entry.getValue().equals("annotation")) {
+                assertThat(entry.getImportedRedactionIntersections()).isEmpty();
+            }
+        });
 }
@@ -1163,7 +1224,10 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 }
 var entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
-var values = entityLog.getEntityLogEntry().stream().map(EntityLogEntry::getValue).collect(Collectors.toList());
+var values = entityLog.getEntityLogEntry()
+        .stream()
+        .map(EntityLogEntry::getValue)
+        .collect(Collectors.toList());
 assertThat(values).contains("Mrs. Robinson");
 assertThat(values).contains("Mr. Bojangles");
@@ -1174,12 +1238,14 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 @Test
 public void signaturesAreRedactionAfterReanalyse() throws IOException {
+    String EFSA_SANITISATION_RULES = loadFromClassPath("drools/efsa_sanitisation.drl");
+    when(rulesClient.getRules(TEST_DOSSIER_TEMPLATE_ID, RuleFileType.ENTITY)).thenReturn(JSONPrimitive.of(EFSA_SANITISATION_RULES));
 AnalyzeRequest request = uploadFileToStorage("files/new/SYNGENTA_EFSA_sanitisation_GFL_v1 (1).pdf");
 ClassPathResource imageServiceResponseFileResource = new ClassPathResource("files/new/SYNGENTA_EFSA_sanitisation_GFL_v1 (1).IMAGE_INFO.json");
 storageService.storeObject(TenantContext.getTenantId(),
         RedactionStorageService.StorageIdUtils.getStorageId(TEST_DOSSIER_ID, TEST_FILE_ID, FileType.IMAGE_INFO),
         imageServiceResponseFileResource.getInputStream());
 System.out.println("Start Full integration test");
 analyzeDocumentStructure(LayoutParsingType.REDACT_MANAGER, request);
@@ -1188,35 +1254,41 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 System.out.println("Finished analysis");
 request.setManualRedactions(ManualRedactions.builder()
         .legalBasisChanges(Set.of(ManualLegalBasisChange.builder()
                 .annotationId("3029651d0842a625f2d23f8375c23600")
                 .section("[19, 2]: Paragraph: Contact point: LexCo Contact:")
                 .value("0049 331 441 551 14")
                 .requestDate(OffsetDateTime.now())
                 .fileId(TEST_FILE_ID)
                 .legalBasis("Article 39(e)(2) of Regulation (EC) No 178/2002")
+                .user("user")
                 .build()))
         .build());
 analyzeService.reanalyze(request);
 System.out.println("Finished reanalysis");
 var entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
-entityLog.getEntityLogEntry().stream().filter(entityLogEntry -> entityLogEntry.getType().equals("signature")).forEach(entityLogEntry -> {
-    assertThat(entityLogEntry.getState() == EntryState.APPLIED).isTrue();
-});
+entityLog.getEntityLogEntry()
+        .stream()
+        .filter(entityLogEntry -> entityLogEntry.getType().equals("signature"))
+        .forEach(entityLogEntry -> {
+            assertThat(entityLogEntry.getState() == EntryState.APPLIED).isTrue();
+        });
 }
 @Test
 public void entityIsAppliedAfterRecateorize() throws IOException {
+    String EFSA_SANITISATION_RULES = loadFromClassPath("drools/efsa_sanitisation.drl");
+    when(rulesClient.getRules(TEST_DOSSIER_TEMPLATE_ID, RuleFileType.ENTITY)).thenReturn(JSONPrimitive.of(EFSA_SANITISATION_RULES));
 AnalyzeRequest request = uploadFileToStorage("files/new/SYNGENTA_EFSA_sanitisation_GFL_v1 (1).pdf");
 ClassPathResource imageServiceResponseFileResource = new ClassPathResource("files/new/SYNGENTA_EFSA_sanitisation_GFL_v1 (1).IMAGE_INFO.json");
 storageService.storeObject(TenantContext.getTenantId(),
         RedactionStorageService.StorageIdUtils.getStorageId(TEST_DOSSIER_ID, TEST_FILE_ID, FileType.IMAGE_INFO),
         imageServiceResponseFileResource.getInputStream());
 System.out.println("Start Full integration test");
 analyzeDocumentStructure(LayoutParsingType.REDACT_MANAGER, request);
@@ -1225,21 +1297,23 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 System.out.println("Finished analysis");
 request.setManualRedactions(ManualRedactions.builder()
         .legalBasisChanges(Set.of(ManualLegalBasisChange.builder()
                 .annotationId("3029651d0842a625f2d23f8375c23600")
                 .section("[19, 2]: Paragraph: Contact point: LexCo Contact:")
                 .value("0049 331 441 551 14")
                 .requestDate(OffsetDateTime.now())
                 .fileId(TEST_FILE_ID)
                 .legalBasis("Article 39(e)(2) of Regulation (EC) No 178/2002")
+                .user("user")
                 .build()))
         .recategorizations(Set.of(ManualRecategorization.builder()
                 .annotationId("3029651d0842a625f2d23f8375c23600")
                 .type("CBI_author")
                 .requestDate(OffsetDateTime.now())
                 .fileId(TEST_FILE_ID)
+                .user("user")
                 .build()))
         .build());
 analyzeService.reanalyze(request);
 System.out.println("Finished reanalysis");
@@ -1266,11 +1340,11 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
 String manualAddId2 = UUID.randomUUID().toString();
 List<Rectangle> positions = List.of(Rectangle.builder().topLeftX(305.35f).topLeftY(332.5033f).width(71.40744f).height(13.645125f).page(1).build());
 ManualRedactionEntry manualRedactionEntry = getManualRedactionEntry(manualAddId,
         positions,
         "the manufacturing or production process, including the method and innovative aspects thereof, as well as other technical and industrial specifications inherent to that process or method, except for information which is relevant to the assessment of safety");
 ManualRedactionEntry manualRedactionEntry2 = getManualRedactionEntry(manualAddId2,
         positions,
         "commercial information revealing sourcing, market shares or business strategy of the applicant");
 IdRemoval idRemoval = getIdRemoval(manualAddId);
 IdRemoval idRemoval2 = getIdRemoval(manualAddId2);
@ -1282,59 +1356,173 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
var entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID); var entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
assertTrue(entityLog.getEntityLogEntry().stream().anyMatch(entityLogEntry -> entityLogEntry.getId().equals(manualAddId))); assertTrue(entityLog.getEntityLogEntry()
assertEquals(entityLog.getEntityLogEntry().stream().filter(entityLogEntry -> entityLogEntry.getId().equals(manualAddId)).findFirst().get().getState(), EntryState.APPLIED); .stream()
.anyMatch(entityLogEntry -> entityLogEntry.getId().equals(manualAddId)));
assertEquals(entityLog.getEntityLogEntry()
.stream()
.filter(entityLogEntry -> entityLogEntry.getId().equals(manualAddId))
.findFirst()
.get().getState(), EntryState.APPLIED);
request.setManualRedactions(ManualRedactions.builder().entriesToAdd(Set.of(manualRedactionEntry)).idsToRemove(Set.of(idRemoval)).build()); request.setManualRedactions(ManualRedactions.builder().entriesToAdd(Set.of(manualRedactionEntry)).idsToRemove(Set.of(idRemoval)).build());
analyzeService.reanalyze(request); analyzeService.reanalyze(request);
entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID); entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
assertTrue(entityLog.getEntityLogEntry().stream().anyMatch(entityLogEntry -> entityLogEntry.getId().equals(manualAddId))); assertTrue(entityLog.getEntityLogEntry()
assertEquals(entityLog.getEntityLogEntry().stream().filter(entityLogEntry -> entityLogEntry.getId().equals(manualAddId)).findFirst().get().getState(), EntryState.REMOVED); .stream()
.anyMatch(entityLogEntry -> entityLogEntry.getId().equals(manualAddId)));
assertEquals(entityLog.getEntityLogEntry()
.stream()
.filter(entityLogEntry -> entityLogEntry.getId().equals(manualAddId))
.findFirst()
.get().getState(), EntryState.REMOVED);
request.setManualRedactions(ManualRedactions.builder().entriesToAdd(Set.of(manualRedactionEntry, manualRedactionEntry2)).idsToRemove(Set.of(idRemoval)).build()); request.setManualRedactions(ManualRedactions.builder().entriesToAdd(Set.of(manualRedactionEntry, manualRedactionEntry2)).idsToRemove(Set.of(idRemoval)).build());
analyzeService.reanalyze(request); analyzeService.reanalyze(request);
entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID); entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
assertTrue(entityLog.getEntityLogEntry().stream().anyMatch(entityLogEntry -> entityLogEntry.getId().equals(manualAddId))); assertTrue(entityLog.getEntityLogEntry()
assertEquals(entityLog.getEntityLogEntry().stream().filter(entityLogEntry -> entityLogEntry.getId().equals(manualAddId)).findFirst().get().getState(), EntryState.REMOVED); .stream()
assertTrue(entityLog.getEntityLogEntry().stream().anyMatch(entityLogEntry -> entityLogEntry.getId().equals(manualAddId2))); .anyMatch(entityLogEntry -> entityLogEntry.getId().equals(manualAddId)));
assertEquals(entityLog.getEntityLogEntry().stream().filter(entityLogEntry -> entityLogEntry.getId().equals(manualAddId2)).findFirst().get().getState(), EntryState.APPLIED); assertEquals(entityLog.getEntityLogEntry()
.stream()
.filter(entityLogEntry -> entityLogEntry.getId().equals(manualAddId))
.findFirst()
.get().getState(), EntryState.REMOVED);
assertTrue(entityLog.getEntityLogEntry()
.stream()
.anyMatch(entityLogEntry -> entityLogEntry.getId().equals(manualAddId2)));
assertEquals(entityLog.getEntityLogEntry()
.stream()
.filter(entityLogEntry -> entityLogEntry.getId().equals(manualAddId2))
.findFirst()
.get().getState(), EntryState.APPLIED);
request.setManualRedactions(ManualRedactions.builder()
.entriesToAdd(Set.of(manualRedactionEntry, manualRedactionEntry2))
.idsToRemove(Set.of(idRemoval, idRemoval2))
.build());
analyzeService.reanalyze(request);
entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
assertTrue(entityLog.getEntityLogEntry()
.stream()
.anyMatch(entityLogEntry -> entityLogEntry.getId().equals(manualAddId)));
assertEquals(entityLog.getEntityLogEntry()
.stream()
.filter(entityLogEntry -> entityLogEntry.getId().equals(manualAddId))
.findFirst()
.get().getState(), EntryState.REMOVED);
assertTrue(entityLog.getEntityLogEntry()
.stream()
.anyMatch(entityLogEntry -> entityLogEntry.getId().equals(manualAddId2)));
assertEquals(entityLog.getEntityLogEntry()
.stream()
.filter(entityLogEntry -> entityLogEntry.getId().equals(manualAddId2))
.findFirst()
.get().getState(), EntryState.REMOVED);
manualRedactionEntry.setRequestDate(OffsetDateTime.now());
request.setManualRedactions(ManualRedactions.builder().entriesToAdd(Set.of(manualRedactionEntry, manualRedactionEntry2)).idsToRemove(Set.of(idRemoval2)).build());
analyzeService.reanalyze(request);
entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
assertTrue(entityLog.getEntityLogEntry()
.stream()
.anyMatch(entityLogEntry -> entityLogEntry.getId().equals(manualAddId)));
assertEquals(entityLog.getEntityLogEntry()
.stream()
.filter(entityLogEntry -> entityLogEntry.getId().equals(manualAddId))
.findFirst()
.get().getState(), EntryState.APPLIED);
assertTrue(entityLog.getEntityLogEntry()
.stream()
.anyMatch(entityLogEntry -> entityLogEntry.getId().equals(manualAddId2)));
assertEquals(entityLog.getEntityLogEntry()
.stream()
.filter(entityLogEntry -> entityLogEntry.getId().equals(manualAddId2))
.findFirst()
.get().getState(), EntryState.REMOVED);
}
@Test
@SneakyThrows
public void testReAddingSameManualRedaction() {
String pdfFile = "files/new/test1S1T1.pdf";
String manualAddId = UUID.randomUUID().toString();
List<Rectangle> positions = List.of(Rectangle.builder().topLeftX(305.35f).topLeftY(332.5033f).width(71.40744f).height(13.645125f).page(1).build());
ManualRedactionEntry manualRedactionEntry = getManualRedactionEntry(manualAddId,
positions,
"the manufacturing or production process, including the method and innovative aspects thereof, as well as other technical and industrial specifications inherent to that process or method, except for information which is relevant to the assessment of safety");
IdRemoval idRemoval = getIdRemoval(manualAddId);
AnalyzeRequest request = uploadFileToStorage(pdfFile);
request.setManualRedactions(ManualRedactions.builder().entriesToAdd(Set.of(manualRedactionEntry)).build());
analyzeDocumentStructure(LayoutParsingType.REDACT_MANAGER, request);
request.setAnalysisNumber(1);
analyzeService.analyze(request);
var entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
assertTrue(entityLog.getEntityLogEntry()
.stream()
.anyMatch(entityLogEntry -> entityLogEntry.getId().equals(manualAddId)));
assertEquals(entityLog.getEntityLogEntry()
.stream()
.filter(entityLogEntry -> entityLogEntry.getId().equals(manualAddId))
.findFirst()
.get().getState(), EntryState.APPLIED);
manualRedactionEntry.setProcessedDate(OffsetDateTime.now());
request.setManualRedactions(ManualRedactions.builder().entriesToAdd(Set.of(manualRedactionEntry)).idsToRemove(Set.of(idRemoval)).build());
request.setAnalysisNumber(2);
analyzeService.reanalyze(request);
entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
assertTrue(entityLog.getEntityLogEntry()
.stream()
.anyMatch(entityLogEntry -> entityLogEntry.getId().equals(manualAddId)));
assertEquals(entityLog.getEntityLogEntry()
.stream()
.filter(entityLogEntry -> entityLogEntry.getId().equals(manualAddId))
.findFirst()
.get().getState(), EntryState.REMOVED);
manualRedactionEntry.setRequestDate(OffsetDateTime.now());
idRemoval.setProcessedDate(OffsetDateTime.now());
request.setManualRedactions(ManualRedactions.builder().entriesToAdd(Set.of(manualRedactionEntry)).idsToRemove(Set.of(idRemoval)).build());
request.setAnalysisNumber(3);
analyzeService.reanalyze(request);
entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
assertTrue(entityLog.getEntityLogEntry()
.stream()
.anyMatch(entityLogEntry -> entityLogEntry.getId().equals(manualAddId)));
assertEquals(entityLog.getEntityLogEntry()
.stream()
.filter(entityLogEntry -> entityLogEntry.getId().equals(manualAddId))
.findFirst()
.get().getState(), EntryState.APPLIED);
}
@Test
@SneakyThrows
public void testResizeWithUpdateDictionaryTrue() {
String EFSA_SANITISATION_RULES = loadFromClassPath("drools/efsa_sanitisation.drl");
when(rulesClient.getRules(TEST_DOSSIER_TEMPLATE_ID, RuleFileType.ENTITY)).thenReturn(JSONPrimitive.of(EFSA_SANITISATION_RULES));
String pdfFile = "files/new/crafted document.pdf";
AnalyzeRequest request = uploadFileToStorage(pdfFile);
@ -1342,23 +1530,40 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
analyzeService.analyze(request);
var entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
var david = entityLog.getEntityLogEntry()
.stream()
.filter(e -> e.getValue().equals("David"))
.findFirst()
.get();
request.setManualRedactions(ManualRedactions.builder()
.resizeRedactions(Set.of(ManualResizeRedaction.builder()
.updateDictionary(true)
.annotationId(david.getId())
.fileId(TEST_FILE_ID)
.user("user")
.requestDate(OffsetDateTime.now())
.value("David Ksenia")
.positions(List.of(Rectangle.builder()
.topLeftX(56.8f)
.topLeftY(293.564f)
.width(65.592f)
.height(15.408f)
.page(1)
.build()))
.addToAllDossiers(false)
.build()))
.build());
analyzeService.reanalyze(request);
entityLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
var resizedEntity = entityLog.getEntityLogEntry()
.stream()
.filter(e -> e.getId().equals(david.getId()))
.findFirst()
.get();
assertEquals(resizedEntity.getState(), EntryState.APPLIED);
assertEquals(resizedEntity.getValue(), "David");
assertEquals(0, resizedEntity.getManualChanges().size());
}
@ -1367,8 +1572,10 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
return IdRemoval.builder()
.annotationId(id)
.removeFromAllDossiers(false)
.fileId(TEST_FILE_ID)
.user("user")
.removeFromDictionary(false)
.requestDate(OffsetDateTime.now())
.build();
}
@ -1378,6 +1585,7 @@ public class RedactionIntegrationTest extends AbstractRedactionIntegrationTest {
ManualRedactionEntry manualRedactionEntry2 = new ManualRedactionEntry();
manualRedactionEntry2.setAnnotationId(id);
manualRedactionEntry2.setFileId("fileId");
manualRedactionEntry2.setUser("test");
manualRedactionEntry2.setType("manual");
manualRedactionEntry2.setRectangle(false);
manualRedactionEntry2.setRequestDate(OffsetDateTime.now());

View File

@ -0,0 +1,160 @@
package com.iqser.red.service.redaction.v1.server.document.graph;
import static com.iqser.red.service.redaction.v1.server.utils.EntityVisualizationUtility.ENTITY_LAYER;
import static org.junit.jupiter.api.Assertions.assertEquals;
import java.awt.Color;
import java.io.File;
import java.nio.file.Path;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import com.iqser.red.service.persistence.service.v1.api.shared.model.dossiertemplate.dossier.file.FileType;
import com.iqser.red.service.redaction.v1.server.model.document.entity.EntityType;
import com.iqser.red.service.redaction.v1.server.model.document.entity.TextEntity;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Document;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.NodeType;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Table;
import com.iqser.red.service.redaction.v1.server.service.document.EntityCreationService;
import com.iqser.red.service.redaction.v1.server.service.document.EntityEnrichmentService;
import com.iqser.red.service.redaction.v1.server.storage.RedactionStorageService;
import com.iqser.red.service.redaction.v1.server.utils.EntityVisualizationUtility;
import com.knecon.fforesight.service.viewerdoc.model.Visualizations;
import com.knecon.fforesight.service.viewerdoc.service.ViewerDocumentService;
import com.knecon.fforesight.tenantcommons.TenantContext;
import lombok.SneakyThrows;
public class TableTest extends BuildDocumentIntegrationTest {
private static final boolean DRAW_FILE = false;
@Autowired
private EntityEnrichmentService entityEnrichmentService;
private EntityCreationService entityCreationService;
private static final String TYPE_1 = "type1";
private static final String TYPE_2 = "type2";
private static final String TYPE_3 = "type3";
private static final String TYPE_4 = "type4";
private Table table;
private Set<TextEntity> entities;
@SneakyThrows
@BeforeEach
public void createTable() {
entityCreationService = new EntityCreationService(entityEnrichmentService);
String fileName = "files/Minimal Examples/BasicTable.pdf";
Document document = buildGraph(fileName);
table = (Table) document.streamAllSubNodesOfType(NodeType.TABLE)
.findAny()
.orElseThrow();
entities = List.of(//
entityCreationService.byString("Cell11", TYPE_1, EntityType.ENTITY, document),
entityCreationService.byString("Cell21", TYPE_1, EntityType.ENTITY, document),
entityCreationService.byString("Cell31", TYPE_1, EntityType.ENTITY, document),
entityCreationService.byString("Cell41", TYPE_1, EntityType.ENTITY, document),
entityCreationService.byString("Cell51", TYPE_1, EntityType.ENTITY, document),
entityCreationService.byString("Cell12", TYPE_2, EntityType.ENTITY, document),
entityCreationService.byString("Cell32", TYPE_2, EntityType.ENTITY, document),
entityCreationService.byString("Cell42", TYPE_2, EntityType.ENTITY, document),
entityCreationService.byString("Cell23", TYPE_3, EntityType.ENTITY, document),
entityCreationService.byString("Cell53", TYPE_3, EntityType.ENTITY, document),
entityCreationService.byString("Cell14", TYPE_4, EntityType.ENTITY, document),
entityCreationService.byString("Cell34", TYPE_4, EntityType.ENTITY, document))
.stream()
.flatMap(Function.identity())
.collect(Collectors.toSet());
if (DRAW_FILE) {
File file = new File("/tmp/" + Path.of(fileName).getFileName().toString());
storageService.downloadTo(TenantContext.getTenantId(),
RedactionStorageService.StorageIdUtils.getStorageId(TEST_DOSSIER_ID, TEST_FILE_ID, FileType.VIEWER_DOCUMENT),
file);
ViewerDocumentService viewerDocumentService = new ViewerDocumentService(null);
var visualizationsOnPage = EntityVisualizationUtility.createVisualizationsOnPage(document.getEntities(), Color.MAGENTA);
viewerDocumentService.addVisualizationsOnPage(file,
file,
Visualizations.builder()
.layer(ENTITY_LAYER)
.visualizationsOnPages(visualizationsOnPage)
.layerVisibilityDefaultValue(true)
.build());
}
}
@Test
public void testStreamEntitiesWhereRowContainsEntitiesOfType() {
int type_2_count = table.getEntitiesOfType(TYPE_2).size();
assertEquals(type_2_count,
table.streamEntitiesWhereRowContainsEntitiesOfType(List.of(TYPE_1))
.filter(textEntity -> textEntity.type().equals(TYPE_2))
.count());
assertEquals(type_2_count,
table.streamEntitiesWhereRowContainsEntitiesOfType(List.of(TYPE_1, TYPE_4))
.filter(textEntity -> textEntity.type().equals(TYPE_2))
.count());
assertEquals(2,
table.streamEntitiesWhereRowContainsEntitiesOfEachType(List.of(TYPE_1, TYPE_4))
.filter(textEntity -> textEntity.type().equals(TYPE_2))
.count());
assertEquals(0,
table.streamEntitiesWhereRowContainsEntitiesOfEachType(List.of(TYPE_1, TYPE_3))
.filter(textEntity -> textEntity.type().equals(TYPE_2))
.count());
assertEquals(0,
table.streamEntitiesWhereRowContainsEntitiesOfEachType(List.of(TYPE_1, TYPE_3, TYPE_4))
.filter(textEntity -> textEntity.type().equals(TYPE_2))
.count());
assertEquals(type_2_count,
table.streamEntitiesWhereRowContainsEntitiesOfEachType(List.of())
.filter(textEntity -> textEntity.type().equals(TYPE_2))
.count());
assertEquals(3,
table.streamTextEntitiesInRow(1)
.count());
assertEquals(2,
table.streamTextEntitiesInRow(4)
.count());
assertEquals(5,
table.streamTextEntitiesInCol(1)
.count());
assertEquals(3,
table.streamTextEntitiesInRow(3)
.count());
}
}
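The assertions in TableTest above implicitly define the row-filter semantics: `streamEntitiesWhereRowContainsEntitiesOfType` keeps rows containing *any* of the given types, while `streamEntitiesWhereRowContainsEntitiesOfEachType` requires *every* listed type to appear in the row (so an empty list matches all rows). A minimal, self-contained sketch of that logic with plain JDK streams, using a hypothetical `Cell` record as a stand-in for the project's `TextEntity`/`Table` types:

```java
import java.util.List;
import java.util.stream.Stream;

class TableRowFilterSketch {

    // Hypothetical stand-in for a typed table cell (the real code uses TextEntity).
    record Cell(String value, String type) {}

    // Cells of rows containing ANY of the given types
    // (mirrors streamEntitiesWhereRowContainsEntitiesOfType).
    static Stream<Cell> cellsInRowsWithAnyType(List<List<Cell>> rows, List<String> types) {
        return rows.stream()
                .filter(row -> row.stream().anyMatch(c -> types.contains(c.type())))
                .flatMap(List::stream);
    }

    // Cells of rows containing EVERY given type; an empty type list matches all rows
    // (mirrors streamEntitiesWhereRowContainsEntitiesOfEachType).
    static Stream<Cell> cellsInRowsWithEachType(List<List<Cell>> rows, List<String> types) {
        return rows.stream()
                .filter(row -> types.stream()
                        .allMatch(t -> row.stream().anyMatch(c -> c.type().equals(t))))
                .flatMap(List::stream);
    }

    public static void main(String[] args) {
        List<List<Cell>> rows = List.of(
                List.of(new Cell("Cell11", "type1"), new Cell("Cell12", "type2"), new Cell("Cell14", "type4")),
                List.of(new Cell("Cell21", "type1"), new Cell("Cell23", "type3")));

        // Both rows contain type1, but only the first row holds a type2 cell.
        long any = cellsInRowsWithAnyType(rows, List.of("type1"))
                .filter(c -> c.type().equals("type2"))
                .count();

        // Only the first row contains both type1 and type4.
        long each = cellsInRowsWithEachType(rows, List.of("type1", "type4"))
                .filter(c -> c.type().equals("type2"))
                .count();

        System.out.println(any + " " + each); // prints "1 1"
    }
}
```

This is a sketch of the contract the test exercises, not the project's implementation; the real `Table` walks the document graph rather than a `List<List<Cell>>`.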

View File

@ -38,6 +38,7 @@ import com.iqser.red.commons.jackson.ObjectMapperFactory;
import com.iqser.red.service.persistence.service.v1.api.shared.model.AnalyzeRequest;
import com.iqser.red.service.persistence.service.v1.api.shared.model.AnalyzeResult;
import com.iqser.red.service.persistence.service.v1.api.shared.model.RuleFileType;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntityLog;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntityLogEntry;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntryState;
@ -58,7 +59,6 @@ import com.iqser.red.service.redaction.v1.server.Application;
import com.iqser.red.service.redaction.v1.server.annotate.AnnotateRequest;
import com.iqser.red.service.redaction.v1.server.annotate.AnnotateResponse;
import com.iqser.red.service.redaction.v1.server.model.document.entity.EntityType;
import com.iqser.red.service.redaction.v1.server.model.document.entity.TextEntity;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Document;
import com.iqser.red.service.redaction.v1.server.redaction.utils.OsUtils;
import com.iqser.red.service.redaction.v1.server.service.document.DocumentGraphMapper;
@ -127,15 +127,15 @@ public class ManualChangesEnd2EndTest extends AbstractRedactionIntegrationTest {
when(dictionaryClient.getVersion(TEST_DOSSIER_TEMPLATE_ID)).thenReturn(0L);
when(dictionaryClient.getAllTypesForDossier(TEST_DOSSIER_ID, false)).thenReturn(List.of(Type.builder()
.id(DOSSIER_REDACTIONS_INDICATOR + ":" + TEST_DOSSIER_TEMPLATE_ID)
.type(DOSSIER_REDACTIONS_INDICATOR)
.dossierTemplateId(TEST_DOSSIER_ID)
.hexColor("#ffe187")
.isHint(hintTypeMap.get(DOSSIER_REDACTIONS_INDICATOR))
.isCaseInsensitive(caseInSensitiveMap.get(DOSSIER_REDACTIONS_INDICATOR))
.isRecommendation(recommendationTypeMap.get(DOSSIER_REDACTIONS_INDICATOR))
.rank(rankTypeMap.get(DOSSIER_REDACTIONS_INDICATOR))
.build()));
mockDictionaryCalls(null);
@ -155,29 +155,40 @@ public class ManualChangesEnd2EndTest extends AbstractRedactionIntegrationTest {
String testEntityValue1 = "Desiree";
String testEntityValue2 = "Melanie";
EntityLog redactionLog = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
assertEquals(2,
redactionLog.getEntityLogEntry()
.stream()
.filter(entry -> entry.getValue().equals(testEntityValue1))
.count());
assertEquals(2,
redactionLog.getEntityLogEntry()
.stream()
.filter(entry -> entry.getValue().equals(testEntityValue2))
.count());
Document document = DocumentGraphMapper.toDocumentGraph(redactionStorageService.getDocumentData(TEST_DOSSIER_ID, TEST_FILE_ID));
String expandedEntityKeyword = "Lorem ipsum dolor sit amet, consectetur adipiscing elit Desiree et al sed do eiusmod tempor incididunt ut labore et dolore magna aliqua Melanie et al. Reference No 12345 Lorem ipsum.";
entityCreationService.byString(expandedEntityKeyword, "PII", EntityType.ENTITY, document)
.findFirst()
.get();
String idToResize = redactionLog.getEntityLogEntry()
.stream()
.filter(entry -> entry.getValue().equals(testEntityValue1))
.max(Comparator.comparingInt(EntityLogEntry::getStartOffset))
.get().getId();
ManualRedactions manualRedactions = new ManualRedactions();
manualRedactions.getResizeRedactions()
.add(ManualResizeRedaction.builder()
.annotationId(idToResize)
.fileId(TEST_FILE_ID)
.value(expandedEntityKeyword)
.positions(List.of(Rectangle.builder().topLeftX(56.8f).topLeftY(454.664f).height(15.408f).width(493.62f).page(3).build(),
Rectangle.builder().topLeftX(56.8f).topLeftY(440.864f).height(15.408f).width(396f).page(3).build()))
.addToAllDossiers(false)
.updateDictionary(false)
.requestDate(OffsetDateTime.now())
.build());
request.setManualRedactions(manualRedactions);
analyzeService.reanalyze(request);
@ -188,21 +199,32 @@ public class ManualChangesEnd2EndTest extends AbstractRedactionIntegrationTest {
try (FileOutputStream fileOutputStream = new FileOutputStream(tmpFile)) {
fileOutputStream.write(annotateResponse.getDocument());
}
EntityLogEntry resizedEntry = redactionLog.getEntityLogEntry()
.stream()
.filter(entry -> entry.getValue().equals(expandedEntityKeyword))
.findFirst()
.get();
assertEquals(idToResize, resizedEntry.getId());
assertEquals(1,
redactionLog.getEntityLogEntry()
.stream()
.filter(entry -> entry.getValue().equals(testEntityValue1))
.count());
assertEquals(1,
redactionLog.getEntityLogEntry()
.stream()
.filter(entry -> entry.getValue().equals(testEntityValue2) && !entry.getState().equals(EntryState.REMOVED))
.count());
}
private static com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.Rectangle toAnnotationRectangle(Rectangle2D rectangle2D, int pageNumber) {
return new com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.Rectangle((float) rectangle2D.getMaxX(),
(float) rectangle2D.getMaxY() - (float) rectangle2D.getHeight(),
(float) rectangle2D.getWidth(),
-(float) rectangle2D.getHeight(),
pageNumber);
}
@ -219,10 +241,10 @@ public class ManualChangesEnd2EndTest extends AbstractRedactionIntegrationTest {
manualRedactions.setIdsToRemove(Set.of(IdRemoval.builder().annotationId("5b940b2cb401ed9f5be6fc24f6e77bcf").fileId("fileId").build()));
manualRedactions.setForceRedactions(Set.of(ManualForceRedaction.builder()
.annotationId("675eba69b0c2917de55462c817adaa05")
.fileId("fileId")
.legalBasis("Something")
.build()));
ManualRedactionEntry manualRedactionEntry = new ManualRedactionEntry();
manualRedactionEntry.setAnnotationId(manualAddId);
@ -232,7 +254,7 @@ public class ManualChangesEnd2EndTest extends AbstractRedactionIntegrationTest {
manualRedactionEntry.setValue("O'Loughlin C.K.");
manualRedactionEntry.setReason("Manual Redaction");
manualRedactionEntry.setPositions(List.of(Rectangle.builder().topLeftX(375.61096f).topLeftY(241.282f).width(7.648041f).height(43.72262f).page(1).build(),
Rectangle.builder().topLeftX(384.83517f).topLeftY(241.282f).width(7.648041f).height(17.043358f).page(1).build()));
AnalyzeRequest request = uploadFileToStorage(pdfFile);
request.setManualRedactions(manualRedactions);
@ -242,11 +264,11 @@ public class ManualChangesEnd2EndTest extends AbstractRedactionIntegrationTest {
manualRedactions.getEntriesToAdd().add(manualRedactionEntry);
manualRedactions.setIdsToRemove(Set.of(IdRemoval.builder().annotationId("5b940b2cb401ed9f5be6fc24f6e77bcf").fileId("fileId").build()));
manualRedactions.setLegalBasisChanges((Set.of(ManualLegalBasisChange.builder()
.annotationId("675eba69b0c2917de55462c817adaa05")
.fileId("fileId")
.legalBasis("Manual Legal Basis Change")
.requestDate(OffsetDateTime.now())
.build())));
analyzeService.reanalyze(request);
@ -295,7 +317,10 @@ public class ManualChangesEnd2EndTest extends AbstractRedactionIntegrationTest {
fileOutputStream.write(annotateResponse.getDocument());
}
long end = System.currentTimeMillis();
var optionalEntry = redactionLog.getEntityLogEntry()
.stream()
.filter(entityLogEntry -> entityLogEntry.getId().equals(manualAddId))
.findAny();
assertTrue(optionalEntry.isPresent());
assertEquals(2, optionalEntry.get().getContainingNodeId().size()); // 2 is the depth of the table instead of the table cell
System.out.println("duration: " + (end - start));
@ -318,6 +343,7 @@ public class ManualChangesEnd2EndTest extends AbstractRedactionIntegrationTest {
.filter(entry -> entry.getValue().equals("Oxford University Press"))
.findFirst()
.get();
assertFalse(oxfordUniversityPress.getEngines().contains(Engine.MANUAL));
var asyaLyon = redactionLog.getEntityLogEntry()
.stream()
@ -344,9 +370,9 @@ public class ManualChangesEnd2EndTest extends AbstractRedactionIntegrationTest {
EntityLog redactionLog2 = redactionStorageService.getEntityLog(TEST_DOSSIER_ID, TEST_FILE_ID);
assertFalse(redactionLog2.getEntityLogEntry()
.stream()
.filter(entry -> entry.getType().equals("published_information"))
.anyMatch(entry -> entry.getValue().equals("Oxford University Press")));
var oxfordUniversityPressRecategorized = redactionLog2.getEntityLogEntry()
.stream()
@ -364,6 +390,7 @@ public class ManualChangesEnd2EndTest extends AbstractRedactionIntegrationTest {
assertEquals(asyaLyon2.getState(), EntryState.APPLIED);
assertEquals(1, oxfordUniversityPressRecategorized.getManualChanges().size());
assertTrue(oxfordUniversityPressRecategorized.getEngines().contains(Engine.MANUAL));
}
@ -379,15 +406,15 @@ public class ManualChangesEnd2EndTest extends AbstractRedactionIntegrationTest {
String annotationId = "testAnnotationId";
manualRedactions.setEntriesToAdd(Set.of(ManualRedactionEntry.builder()
.annotationId(annotationId)
.requestDate(OffsetDateTime.now())
.type("manual")
.value("Expand to Hint Clarissas Donut ← not added to Dict, should be not annotated Simpson's Tower ← added to Authors-Dict, should be annotated")
.positions(List.of(//
new Rectangle(new Point(56.8f, 496.27f), 61.25f, 12.83f, 2), //
new Rectangle(new Point(56.8f, 482.26f), 303.804f, 15.408f, 2), //
new Rectangle(new Point(56.8f, 468.464f), 314.496f, 15.408f, 2))) //
.build()));
ManualResizeRedaction manualResizeRedaction = ManualResizeRedaction.builder()
.annotationId(annotationId)
.requestDate(OffsetDateTime.now())
@@ -401,10 +428,58 @@ public class ManualChangesEnd2EndTest extends AbstractRedactionIntegrationTest {
analyzeService.reanalyze(request);
EntityLog entityLog = redactionStorageService.getEntityLog(request.getDossierId(), request.getFileId());
EntityLogEntry entityLogEntry = entityLog.getEntityLogEntry()
.stream()
.filter(entry -> entry.getId().equals(annotationId))
.findFirst()
.orElseThrow();
assertEquals("Expand to Hint", entityLogEntry.getValue());
assertEquals(1, entityLogEntry.getPositions().size());
assertEquals(ManualRedactionType.RESIZE,
entityLogEntry.getManualChanges()
.get(entityLogEntry.getManualChanges().size() - 1).getManualRedactionType());
assertTrue(entityLogEntry.getEngines().contains(Engine.MANUAL));
}
@Test
@SneakyThrows
public void testAddEngineManualToResizeDictionaryEntry() {
String filePath = "files/new/crafted document.pdf";
AnalyzeRequest request = uploadFileToStorage(filePath);
analyzeDocumentStructure(LayoutParsingType.REDACT_MANAGER, request);
AnalyzeResult result = analyzeService.analyze(request);
ManualRedactions manualRedactions = new ManualRedactions();
EntityLog entityLog = redactionStorageService.getEntityLog(request.getDossierId(), request.getFileId());
var dictionaryEntry = entityLog.getEntityLogEntry()
.stream()
.filter(entry -> entry.isDictionaryEntry() || entry.isDossierDictionaryEntry())
.findFirst()
.get();
ManualResizeRedaction manualResizeRedaction = ManualResizeRedaction.builder()
.annotationId(dictionaryEntry.getId())
.requestDate(OffsetDateTime.now())
.value("Image")
.positions(List.of(new Rectangle(new Point(56.8f, 496.27f), 61.25f, 12.83f, 1)))
.updateDictionary(true)
.build();
manualRedactions.setResizeRedactions(Set.of(manualResizeRedaction));
request.setManualRedactions(manualRedactions);
analyzeService.reanalyze(request);
entityLog = redactionStorageService.getEntityLog(request.getDossierId(), request.getFileId());
EntityLogEntry entityLogEntry = entityLog.getEntityLogEntry()
.stream()
.filter(entry -> entry.getId().equals(dictionaryEntry.getId()))
.findFirst()
.orElseThrow();
assertEquals(ManualRedactionType.RESIZE_IN_DICTIONARY,
entityLogEntry.getManualChanges()
.get(entityLogEntry.getManualChanges().size() - 1).getManualRedactionType());
assertTrue(entityLogEntry.getEngines().contains(Engine.MANUAL));
}
}


@@ -32,18 +32,33 @@ public class ManualChangesIntegrationTest extends RulesIntegrationTest {
public void manualResizeRedactionTest() {
Document document = buildGraph("files/new/crafted document");
Set<TextEntity> entities = entityCreationService.byString("David Ksenia", "CBI_author", EntityType.ENTITY, document)
.collect(Collectors.toUnmodifiableSet());
Set<TextEntity> biggerEntities = entityCreationService.byString("David Ksenia Max Mustermann", "CBI_author", EntityType.ENTITY, document)
.collect(Collectors.toUnmodifiableSet());
TextEntity entity = entities.stream()
.filter(e -> e.getPages()
.stream()
.anyMatch(p -> p.getNumber() == 1))
.findFirst()
.get();
TextEntity biggerEntity = biggerEntities.stream()
.filter(e -> e.getPages()
.stream()
.anyMatch(p -> p.getNumber() == 1))
.findFirst()
.get();
String initialId = entity.getPositionsOnPagePerPage()
.get(0).getId();
ManualResizeRedaction manualResizeRedaction = ManualResizeRedaction.builder()
.annotationId(initialId)
.fileId(TEST_FILE_ID)
.user("user")
.value(biggerEntity.getValue())
.positions(toAnnotationRectangles(biggerEntity.getPositionsOnPagePerPage()
.get(0)))
.requestDate(OffsetDateTime.now())
.updateDictionary(false)
.build();
@@ -55,8 +70,13 @@ public class ManualChangesIntegrationTest extends RulesIntegrationTest {
assertTrue(Sets.difference(new HashSet<>(biggerEntity.getIntersectingNodes()), new HashSet<>(entity.getIntersectingNodes())).isEmpty());
assertEquals(biggerEntity.getPages(), entity.getPages());
assertEquals(biggerEntity.getValue(), entity.getValue());
assertEquals(initialId,
entity.getPositionsOnPagePerPage()
.get(0).getId());
assertRectanglesAlmostEqual(biggerEntity.getPositionsOnPagePerPage()
.get(0).getRectanglePerLine(),
entity.getPositionsOnPagePerPage()
.get(0).getRectanglePerLine());
assertTrue(entity.resized());
}
@@ -65,12 +85,25 @@ public class ManualChangesIntegrationTest extends RulesIntegrationTest {
public void manualForceRedactionTest() {
Document document = buildGraph("files/new/crafted document");
Set<TextEntity> entities = entityCreationService.byString("David Ksenia", "CBI_author", EntityType.ENTITY, document)
.collect(Collectors.toUnmodifiableSet());
TextEntity entity = entities.stream()
.filter(e -> e.getPages()
.stream()
.anyMatch(p -> p.getNumber() == 1))
.findFirst()
.get();
String initialId = entity.getPositionsOnPagePerPage()
.get(0).getId();
ManualForceRedaction manualForceRedaction = ManualForceRedaction.builder()
.annotationId(initialId)
.fileId(TEST_FILE_ID)
.user("user")
.legalBasis("Something")
.requestDate(OffsetDateTime.now())
.build();
doAnalysis(document, List.of(manualForceRedaction));
@@ -78,8 +111,12 @@ public class ManualChangesIntegrationTest extends RulesIntegrationTest {
assertFalse(entity.getIntersectingNodes().isEmpty());
assertEquals(1, entity.getPages().size());
assertEquals("David Ksenia", entity.getValue());
assertEquals("Something",
entity.getManualOverwrite().getLegalBasis()
.orElse(entity.getMatchedRule().getLegalBasis()));
assertEquals(initialId,
entity.getPositionsOnPagePerPage()
.get(0).getId());
assertFalse(entity.removed());
assertTrue(entity.hasManualChanges());
assertTrue(entity.applied());
@@ -90,17 +127,26 @@ public class ManualChangesIntegrationTest extends RulesIntegrationTest {
public void manualIDRemovalTest() {
Document document = buildGraph("files/new/crafted document");
Set<TextEntity> entities = entityCreationService.byString("David Ksenia", "CBI_author", EntityType.ENTITY, document)
.collect(Collectors.toUnmodifiableSet());
TextEntity entity = entities.stream()
.filter(e -> e.getPages()
.stream()
.anyMatch(p -> p.getNumber() == 1))
.findFirst()
.get();
String initialId = entity.getPositionsOnPagePerPage()
.get(0).getId();
IdRemoval idRemoval = IdRemoval.builder().annotationId(initialId).requestDate(OffsetDateTime.now()).fileId(TEST_FILE_ID).user("user").build();
doAnalysis(document, List.of(idRemoval));
assertEquals("David Ksenia", entity.getValue());
assertEquals(initialId,
entity.getPositionsOnPagePerPage()
.get(0).getId());
assertTrue(entity.ignored());
}
@@ -109,13 +155,25 @@ public class ManualChangesIntegrationTest extends RulesIntegrationTest {
public void manualIDRemovalButAlsoForceRedactionTest() {
Document document = buildGraph("files/new/crafted document");
Set<TextEntity> entities = entityCreationService.byString("David Ksenia", "CBI_author", EntityType.ENTITY, document)
.collect(Collectors.toUnmodifiableSet());
TextEntity entity = entities.stream()
.filter(e -> e.getPages()
.stream()
.anyMatch(p -> p.getNumber() == 1))
.findFirst()
.get();
String initialId = entity.getPositionsOnPagePerPage()
.get(0).getId();
ManualForceRedaction manualForceRedaction = ManualForceRedaction.builder()
.annotationId(initialId)
.legalBasis("Something")
.requestDate(OffsetDateTime.now())
.fileId(TEST_FILE_ID)
.user("user")
.build();
doAnalysis(document, List.of(manualForceRedaction));
@@ -123,7 +181,9 @@ public class ManualChangesIntegrationTest extends RulesIntegrationTest {
assertFalse(entity.getIntersectingNodes().isEmpty());
assertEquals(1, entity.getPages().size());
assertEquals("David Ksenia", entity.getValue());
assertEquals(initialId,
entity.getPositionsOnPagePerPage()
.get(0).getId());
assertFalse(entity.removed());
assertFalse(entity.ignored());
}
@@ -131,7 +191,9 @@ public class ManualChangesIntegrationTest extends RulesIntegrationTest {
private void assertRectanglesAlmostEqual(Collection<Rectangle2D> rects1, Collection<Rectangle2D> rects2) {
if (rects1.stream()
.allMatch(rect1 -> rects2.stream()
.anyMatch(rect2 -> rectanglesAlmostEqual(rect1, rect2)))) {
return;
}
// use this for nice formatting of error message
@@ -143,15 +205,18 @@ public class ManualChangesIntegrationTest extends RulesIntegrationTest {
double tolerance = 1e-1;
return Math.abs(r1.getX() - r2.getX()) < tolerance &&//
Math.abs(r1.getY() - r2.getY()) < tolerance &&//
Math.abs(r1.getWidth() - r2.getWidth()) < tolerance &&//
Math.abs(r1.getHeight() - r2.getHeight()) < tolerance;
}
private static List<Rectangle> toAnnotationRectangles(PositionOnPage positionsOnPage) {
return positionsOnPage.getRectanglePerLine()
.stream()
.map(rectangle2D -> toAnnotationRectangle(rectangle2D, positionsOnPage.getPage().getNumber()))
.toList();
}


@@ -43,7 +43,9 @@ public class ManualChangesUnitTest extends BuildDocumentIntegrationTest {
OffsetDateTime start = OffsetDateTime.now();
String reason = "whatever";
Document document = buildGraphNoImages("files/new/crafted document.pdf");
List<TextEntity> entities = entityCreationService.byString("David Ksenia", "test", EntityType.ENTITY, document)
.peek(e -> e.apply("T.0.0", reason))
.toList();
assertFalse(entities.isEmpty());
TextEntity entity = entities.get(0);
assertTrue(entity.active());
@@ -52,10 +54,11 @@ public class ManualChangesUnitTest extends BuildDocumentIntegrationTest {
assertFalse(entity.resized());
assertFalse(entity.ignored());
assertEquals("n-a", entity.getMatchedRule().getLegalBasis());
String annotationId = entity.getPositionsOnPagePerPage()
.get(0).getId();
// remove first
IdRemoval removal = IdRemoval.builder().requestDate(start).fileId(TEST_FILE_ID).user("user").annotationId(annotationId).build();
entity.getManualOverwrite().addChange(removal);
assertTrue(entity.ignored());
assertFalse(entity.applied());
@@ -65,6 +68,7 @@ public class ManualChangesUnitTest extends BuildDocumentIntegrationTest {
ManualForceRedaction forceRedaction = ManualForceRedaction.builder()
.requestDate(start.plusSeconds(1))
.fileId(TEST_FILE_ID)
.user("user")
.annotationId(annotationId)
.legalBasis("coolio")
.build();
@@ -73,10 +77,12 @@ public class ManualChangesUnitTest extends BuildDocumentIntegrationTest {
assertFalse(entity.ignored());
assertFalse(entity.removed());
assertEquals(reason + ", removed by manual override, forced by manual override", entity.buildReasonWithManualChangeDescriptions());
assertEquals("coolio",
entity.getManualOverwrite().getLegalBasis()
.orElse(entity.getMatchedRule().getLegalBasis()));
// remove again
IdRemoval removal2 = IdRemoval.builder().requestDate(start.plusSeconds(3)).fileId(TEST_FILE_ID).annotationId(annotationId).user("user").build();
entity.getManualOverwrite().addChange(removal2);
assertTrue(entity.ignored());
assertFalse(entity.applied());
@@ -86,6 +92,7 @@ public class ManualChangesUnitTest extends BuildDocumentIntegrationTest {
ManualForceRedaction forceRedaction2 = ManualForceRedaction.builder()
.requestDate(start.plusSeconds(2))
.fileId(TEST_FILE_ID)
.user("user")
.annotationId(annotationId)
.legalBasis("coolio")
.build();
@@ -93,7 +100,7 @@ public class ManualChangesUnitTest extends BuildDocumentIntegrationTest {
assertTrue(entity.ignored());
assertFalse(entity.applied());
assertEquals(reason + ", removed by manual override, forced by manual override, forced by manual override, removed by manual override",
entity.buildReasonWithManualChangeDescriptions());
String legalBasis = "Yeah";
String section = "Some random section!";
@@ -103,6 +110,7 @@ public class ManualChangesUnitTest extends BuildDocumentIntegrationTest {
.annotationId(annotationId)
.requestDate(start.plusSeconds(4))
.section(section)
.fileId(TEST_FILE_ID)
.user("peter")
.value(value)
.build();
@@ -110,16 +118,32 @@ public class ManualChangesUnitTest extends BuildDocumentIntegrationTest {
assertTrue(entity.ignored());
assertFalse(entity.applied());
assertEquals(reason + ", removed by manual override, forced by manual override, forced by manual override, removed by manual override, legal basis was manually changed",
entity.buildReasonWithManualChangeDescriptions());
assertEquals(value,
entity.getManualOverwrite().getValue()
.orElse(entity.getValue()));
assertEquals(legalBasis,
entity.getManualOverwrite().getLegalBasis()
.orElse(entity.getMatchedRule().getLegalBasis()));
assertEquals(section,
entity.getManualOverwrite().getSection()
.orElse(entity.getDeepestFullyContainingNode().toString()));
ManualRecategorization imageRecategorizationRequest = ManualRecategorization.builder()
.type("type")
.requestDate(start.plusSeconds(5))
.fileId(TEST_FILE_ID)
.user("user")
.annotationId(annotationId)
.build();
entity.getManualOverwrite().addChange(imageRecategorizationRequest);
assertTrue(entity.getManualOverwrite().getRecategorized()
.isPresent());
assertTrue(entity.getManualOverwrite().getRecategorized()
.get());
assertEquals("type",
entity.getManualOverwrite().getType()
.orElse(entity.type()));
}
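The unit test above adds removals and force-redactions with `start.plusSeconds(n)` timestamps deliberately out of order: after `forceRedaction2` (at +2s) is applied, the entity is still ignored, because `removal2` (at +3s) keeps the latest `requestDate`. A minimal sketch of that "latest request date wins" resolution, with a hypothetical `Change` record standing in for the service's real manual-change model:

```java
import java.time.OffsetDateTime;
import java.util.Comparator;
import java.util.List;

public class LatestChangeWinsDemo {

    // Hypothetical stand-in for a manual change: only the fields needed here.
    record Change(OffsetDateTime requestDate, String type) {}

    // Resolve the effective change: the latest requestDate wins,
    // regardless of the order in which the changes were added.
    static Change effective(List<Change> changes) {
        return changes.stream()
                .max(Comparator.comparing(Change::requestDate))
                .orElseThrow();
    }

    public static void main(String[] args) {
        OffsetDateTime start = OffsetDateTime.now();
        // Added out of order, like forceRedaction2 (+2s) arriving after removal2 (+3s).
        List<Change> changes = List.of(
                new Change(start, "REMOVE"),
                new Change(start.plusSeconds(3), "REMOVE"),
                new Change(start.plusSeconds(2), "FORCE"));
        System.out.println(effective(changes).type()); // REMOVE
    }
}
```

Ordering by request date rather than insertion order makes the history replayable: re-applying the same set of changes in any order yields the same final state.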
@@ -129,7 +153,9 @@ public class ManualChangesUnitTest extends BuildDocumentIntegrationTest {
OffsetDateTime start = OffsetDateTime.now();
String reason = "whatever";
Document document = buildGraphNoImages("files/new/crafted document.pdf");
List<TextEntity> entities = entityCreationService.byString("David Ksenia", "test", EntityType.HINT, document)
.peek(e -> e.apply("T.0.0", reason))
.toList();
assertFalse(entities.isEmpty());
TextEntity entity = entities.get(0);
assertTrue(entity.active());
@@ -138,10 +164,11 @@ public class ManualChangesUnitTest extends BuildDocumentIntegrationTest {
assertFalse(entity.resized());
assertFalse(entity.ignored());
assertEquals("n-a", entity.getMatchedRule().getLegalBasis());
String annotationId = entity.getPositionsOnPagePerPage()
.get(0).getId();
// remove first
IdRemoval removal = IdRemoval.builder().requestDate(start).fileId(TEST_FILE_ID).annotationId(annotationId).user("user").build();
entity.getManualOverwrite().addChange(removal);
assertTrue(entity.ignored());
assertFalse(entity.applied());
@@ -152,6 +179,7 @@ public class ManualChangesUnitTest extends BuildDocumentIntegrationTest {
.requestDate(start.plusSeconds(1))
.fileId(TEST_FILE_ID)
.annotationId(annotationId)
.user("user")
.legalBasis("coolio")
.build();
entity.getManualOverwrite().addChange(forceRedaction);
@@ -159,7 +187,9 @@ public class ManualChangesUnitTest extends BuildDocumentIntegrationTest {
assertFalse(entity.ignored());
assertFalse(entity.removed());
assertEquals(reason + ", removed by manual override, forced by manual override", entity.buildReasonWithManualChangeDescriptions());
assertEquals("coolio",
entity.getManualOverwrite().getLegalBasis()
.orElse(entity.getMatchedRule().getLegalBasis()));
}


@@ -84,7 +84,7 @@ public class PrecursorEntityTest extends BuildDocumentIntegrationTest {
public void testFoundManualAddRedactionAndRemovedHasStateRemoved() {
DocumentAndEntity context = createFoundManualRedaction();
IdRemoval removal = IdRemoval.builder().annotationId("123").user("user").fileId(TEST_FILE_ID).requestDate(OffsetDateTime.now()).build();
context.entity().getManualOverwrite().addChange(removal);
assertTrue(context.entity().removed());
}
@@ -95,7 +95,7 @@ public class PrecursorEntityTest extends BuildDocumentIntegrationTest {
public void testNotFoundManualAddRedactionAndRemovedHasStateRemoved() {
DocumentAndEntity context = createNotFoundManualRedaction();
IdRemoval removal = IdRemoval.builder().fileId(TEST_FILE_ID).user("user").annotationId("123").requestDate(OffsetDateTime.now()).build();
context.entity().getManualOverwrite().addChange(removal);
assertTrue(context.entity().removed());
}
@@ -108,8 +108,11 @@ public class PrecursorEntityTest extends BuildDocumentIntegrationTest {
String value = "To: Syngenta Ltd. Jealotts Hill";
String type = DICTIONARY_AUTHOR;
ManualRedactionEntry manualRedactionEntry = ManualRedactionEntry.builder()
.annotationId("123")
.type(type)
.value(value)
.user("user")
.fileId(TEST_FILE_ID)
.reason("reason")
.legalBasis("n-a")
.section("n-a")
@@ -122,17 +125,20 @@ public class PrecursorEntityTest extends BuildDocumentIntegrationTest {
assertTrue(document.getEntities().isEmpty());
List<PrecursorEntity> notFoundManualEntities = entityFromPrecursorCreationService.createEntitiesIfFoundAndReturnNotFoundEntries(ManualRedactions.builder()
.entriesToAdd(Set.of(
manualRedactionEntry))
.build(),
document,
TEST_DOSSIER_TEMPLATE_ID);
assertEquals(1, notFoundManualEntities.size());
assertTrue(document.getEntities().isEmpty());
List<EntityLogEntry> redactionLogEntries = entityLogCreatorService.createInitialEntityLog(new AnalyzeRequest(),
document,
notFoundManualEntities,
new DictionaryVersion(),
0L).getEntityLogEntry();
assertEquals(1, redactionLogEntries.size());
assertEquals(value, redactionLogEntries.get(0).getValue());
@@ -146,7 +152,8 @@ public class PrecursorEntityTest extends BuildDocumentIntegrationTest {
Document document = buildGraph("files/new/VV-919901.pdf");
EntityCreationService entityCreationService = new EntityCreationService(entityEnrichmentService);
List<TextEntity> tempEntities = entityCreationService.byString("To: Syngenta Ltd.", "temp", EntityType.ENTITY, document)
.toList();
assertFalse(tempEntities.isEmpty());
var tempEntity = tempEntities.get(0);
List<Rectangle> positions = tempEntity.getPositionsOnPagePerPage()
@@ -158,8 +165,11 @@ public class PrecursorEntityTest extends BuildDocumentIntegrationTest {
ManualRedactionEntry manualRedactionEntry = ManualRedactionEntry.builder()
.type("manual")
.annotationId("123")
.value(tempEntity.getValue())
.reason("reason")
.user("user")
.fileId(TEST_FILE_ID)
.legalBasis("n-a")
.section(tempEntity.getDeepestFullyContainingNode().toString())
.rectangle(false)
@@ -172,21 +182,28 @@ public class PrecursorEntityTest extends BuildDocumentIntegrationTest {
tempEntity.removeFromGraph();
assertTrue(document.getEntities().isEmpty());
List<PrecursorEntity> notFoundManualEntities = entityFromPrecursorCreationService.createEntitiesIfFoundAndReturnNotFoundEntries(ManualRedactions.builder()
.entriesToAdd(Set.of(
manualRedactionEntry))
.build(),
document,
TEST_DOSSIER_TEMPLATE_ID);
assertTrue(notFoundManualEntities.isEmpty());
assertEquals(1, document.getEntities().size());
return new DocumentAndEntity(document,
document.getEntities()
.stream()
.findFirst()
.get());
}
public static Rectangle toAnnotationRectangle(Rectangle2D rectangle2D, int pageNumber) { public static Rectangle toAnnotationRectangle(Rectangle2D rectangle2D, int pageNumber) {
return new Rectangle(new Point((float) rectangle2D.getMinX(), (float) (rectangle2D.getMinY() + rectangle2D.getHeight())), return new Rectangle(new Point((float) rectangle2D.getMinX(), (float) (rectangle2D.getMinY() + rectangle2D.getHeight())),
(float) rectangle2D.getWidth(), (float) rectangle2D.getWidth(),
-(float) rectangle2D.getHeight(), -(float) rectangle2D.getHeight(),
pageNumber); pageNumber);
} }
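The `toAnnotationRectangle` helper above converts a `Rectangle2D` anchored at its minimum corner (y-axis pointing up, as in PDF coordinates) into an annotation rectangle anchored at the top-left corner with a negated height. A minimal sketch of that axis flip, using plain doubles; the class and method names here are illustrative stand-ins, not part of the codebase:

```java
public class RectangleFlipDemo {

    // Convert a min-corner rectangle (y grows upward) to a top-left-anchored
    // one: the anchor's y moves to minY + height and the height is negated,
    // mirroring the arithmetic in toAnnotationRectangle.
    static double[] toTopLeftAnchored(double minX, double minY, double width, double height) {
        return new double[] { minX, minY + height, width, -height };
    }

    public static void main(String[] args) {
        double[] r = toTopLeftAnchored(10, 20, 5, 8);
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]);
        // prints "10.0 28.0 5.0 -8.0"
    }
}
```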


@@ -3,6 +3,7 @@ package com.iqser.red.service.redaction.v1.server.rules;
 import java.util.Collection;
 import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.extension.ExtendWith;
 import org.kie.api.KieServices;
 import org.kie.api.builder.KieBuilder;
 import org.kie.api.builder.KieFileSystem;
@@ -10,6 +11,8 @@ import org.kie.api.builder.KieModule;
 import org.kie.api.runtime.KieContainer;
 import org.kie.api.runtime.KieSession;
 import org.kie.internal.io.ResourceFactory;
+import org.mockito.Mockito;
+import org.mockito.junit.jupiter.MockitoExtension;
 import org.springframework.beans.factory.annotation.Autowired;
 import org.springframework.beans.factory.annotation.Qualifier;
 import org.springframework.context.annotation.Bean;
@@ -17,11 +20,13 @@ import org.springframework.context.annotation.Configuration;
 import org.springframework.context.annotation.Import;
 import com.iqser.red.service.redaction.v1.server.document.graph.BuildDocumentIntegrationTest;
+import com.iqser.red.service.redaction.v1.server.model.dictionary.Dictionary;
 import com.iqser.red.service.redaction.v1.server.model.document.nodes.Document;
 import com.iqser.red.service.redaction.v1.server.service.ManualChangesApplicationService;
 import com.iqser.red.service.redaction.v1.server.service.document.EntityCreationService;
 import com.iqser.red.service.redaction.v1.server.service.document.EntityEnrichmentService;
+@ExtendWith(MockitoExtension.class)
 public class RulesIntegrationTest extends BuildDocumentIntegrationTest {

     protected static final String RULES = "drools/rules.drl";
@@ -72,10 +77,12 @@ public class RulesIntegrationTest extends BuildDocumentIntegrationTest {
     @BeforeEach
     public void createServices() {
+        Dictionary dict = Mockito.mock(Dictionary.class);
         kieSession = kieContainer.newKieSession();
         entityCreationService = new EntityCreationService(entityEnrichmentService, kieSession);
         kieSession.setGlobal("manualChangesApplicationService", manualChangesApplicationService);
         kieSession.setGlobal("entityCreationService", entityCreationService);
+        kieSession.setGlobal("dictionary", dict);
     }
 }


@@ -8,6 +8,7 @@ import static org.mockito.Mockito.times;
 import static org.mockito.Mockito.verify;
 import static org.mockito.Mockito.when;
+import java.time.OffsetDateTime;
 import java.util.List;
 import java.util.Optional;
 import java.util.Set;
@@ -34,7 +35,6 @@ import org.springframework.test.context.junit.jupiter.SpringExtension;
 import com.iqser.red.commons.jackson.ObjectMapperFactory;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.AnalyzeRequest;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.RuleFileType;
-import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.AnnotationStatus;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.ManualRedactions;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.Rectangle;
 import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRedactionEntry;
@@ -84,6 +84,7 @@ public class UnprocessedChangesServiceTest extends AbstractRedactionIntegrationT
 @SpyBean
 RabbitTemplate rabbitTemplate;

 @BeforeEach
 public void stubClients() {
@@ -101,21 +102,22 @@ public class UnprocessedChangesServiceTest extends AbstractRedactionIntegrationT
     when(dictionaryClient.getVersion(TEST_DOSSIER_TEMPLATE_ID)).thenReturn(0L);
     when(dictionaryClient.getAllTypesForDossier(TEST_DOSSIER_ID, true)).thenReturn(List.of(Type.builder()
             .id(DOSSIER_REDACTIONS_INDICATOR + ":" + TEST_DOSSIER_TEMPLATE_ID)
             .type(DOSSIER_REDACTIONS_INDICATOR)
             .dossierTemplateId(TEST_DOSSIER_ID)
             .hexColor("#ffe187")
             .isHint(hintTypeMap.get(DOSSIER_REDACTIONS_INDICATOR))
             .isCaseInsensitive(caseInSensitiveMap.get(DOSSIER_REDACTIONS_INDICATOR))
             .isRecommendation(recommendationTypeMap.get(DOSSIER_REDACTIONS_INDICATOR))
             .rank(rankTypeMap.get(DOSSIER_REDACTIONS_INDICATOR))
             .build()));
     mockDictionaryCalls(null);
     when(dictionaryClient.getColors(TEST_DOSSIER_TEMPLATE_ID)).thenReturn(colors);
 }

 @Test
 @SneakyThrows
 public void testManualSurroundingText() {
@@ -125,10 +127,20 @@ public class UnprocessedChangesServiceTest extends AbstractRedactionIntegrationT
     ManualRedactions manualRedactions = new ManualRedactions();
     var aoelId = UUID.randomUUID().toString();
-    ManualRedactionEntry manualRedactionEntry = prepareManualRedactionEntry(aoelId, List.of(Rectangle.builder().topLeftX(355.53775f).topLeftY(266.1895f).width(29.32224f).height(10.048125f).page(1).build()), "AOEL");
+    ManualRedactionEntry manualRedactionEntry = prepareManualRedactionEntry(aoelId,
+            List.of(Rectangle.builder()
+                    .topLeftX(355.53775f)
+                    .topLeftY(266.1895f)
+                    .width(29.32224f)
+                    .height(10.048125f)
+                    .page(1)
+                    .build()),
+            "AOEL");
     var notFoundId = UUID.randomUUID().toString();
-    ManualRedactionEntry manualRedactionEntry2 = prepareManualRedactionEntry(notFoundId, List.of(Rectangle.builder().topLeftX(1f).topLeftY(1f).width(1f).height(1f).page(1).build()), "Random");
+    ManualRedactionEntry manualRedactionEntry2 = prepareManualRedactionEntry(notFoundId,
+            List.of(Rectangle.builder().topLeftX(1f).topLeftY(1f).width(1f).height(1f).page(1).build()),
+            "Random");
     manualRedactions.getEntriesToAdd().add(manualRedactionEntry);
     manualRedactions.getEntriesToAdd().add(manualRedactionEntry2);
@@ -147,30 +159,43 @@ public class UnprocessedChangesServiceTest extends AbstractRedactionIntegrationT
     assertFalse(unprocessedManualEntities.isEmpty());
     assertEquals(unprocessedManualEntities.size(), 2);
-    Optional<UnprocessedManualEntity> optionalUnprocessedManualEntity = unprocessedManualEntities.stream().filter(manualEntity -> manualEntity.getAnnotationId().equals(aoelId)).findFirst();
+    Optional<UnprocessedManualEntity> optionalUnprocessedManualEntity = unprocessedManualEntities.stream()
+            .filter(manualEntity -> manualEntity.getAnnotationId().equals(aoelId))
+            .findFirst();
     assertTrue(optionalUnprocessedManualEntity.isPresent());
     UnprocessedManualEntity unprocessedManualEntity = optionalUnprocessedManualEntity.get();
     assertEquals(unprocessedManualEntity.getTextBefore(), "was above the ");
     assertEquals(unprocessedManualEntity.getTextAfter(), " without PPE (34%");
     assertEquals(unprocessedManualEntity.getSection(), "[1, 1]: Paragraph: A9396G containing 960 g/L");
-    assertEquals(unprocessedManualEntity.getPositions().get(0).x(), 355.53775f);
-    assertEquals(unprocessedManualEntity.getPositions().get(0).y(), 266.49002f);
-    assertEquals(unprocessedManualEntity.getPositions().get(0).w(), 29.322266f);
-    assertEquals(unprocessedManualEntity.getPositions().get(0).h(), 11.017679f);
+    assertEquals(unprocessedManualEntity.getPositions()
+            .get(0).x(), 355.53775f);
+    assertEquals(unprocessedManualEntity.getPositions()
+            .get(0).y(), 266.49002f);
+    assertEquals(unprocessedManualEntity.getPositions()
+            .get(0).w(), 29.322266f);
+    assertEquals(unprocessedManualEntity.getPositions()
+            .get(0).h(), 11.017679f);
-    Optional<UnprocessedManualEntity> optionalNotFoundUnprocessedManualEntity = unprocessedManualEntities.stream().filter(manualEntity -> manualEntity.getAnnotationId().equals(notFoundId)).findFirst();
+    Optional<UnprocessedManualEntity> optionalNotFoundUnprocessedManualEntity = unprocessedManualEntities.stream()
+            .filter(manualEntity -> manualEntity.getAnnotationId().equals(notFoundId))
+            .findFirst();
     assertTrue(optionalNotFoundUnprocessedManualEntity.isPresent());
     UnprocessedManualEntity unprocessedNotFoundManualEntity = optionalNotFoundUnprocessedManualEntity.get();
     assertEquals(unprocessedNotFoundManualEntity.getTextBefore(), "");
     assertEquals(unprocessedNotFoundManualEntity.getTextAfter(), "");
     assertEquals(unprocessedNotFoundManualEntity.getSection(), "");
-    assertEquals(unprocessedNotFoundManualEntity.getPositions().get(0).getPageNumber(), 1);
-    assertEquals(unprocessedNotFoundManualEntity.getPositions().get(0).getRectangle()[0], 1f);
-    assertEquals(unprocessedNotFoundManualEntity.getPositions().get(0).getRectangle()[1], 1f);
-    assertEquals(unprocessedNotFoundManualEntity.getPositions().get(0).getRectangle()[2], 1f);
-    assertEquals(unprocessedNotFoundManualEntity.getPositions().get(0).getRectangle()[3], 1f);
+    assertEquals(unprocessedNotFoundManualEntity.getPositions()
+            .get(0).getPageNumber(), 1);
+    assertEquals(unprocessedNotFoundManualEntity.getPositions()
+            .get(0).getRectangle()[0], 1f);
+    assertEquals(unprocessedNotFoundManualEntity.getPositions()
+            .get(0).getRectangle()[1], 1f);
+    assertEquals(unprocessedNotFoundManualEntity.getPositions()
+            .get(0).getRectangle()[2], 1f);
+    assertEquals(unprocessedNotFoundManualEntity.getPositions()
+            .get(0).getRectangle()[3], 1f);

     analyzeService.reanalyze(request);
@@ -190,10 +215,14 @@ public class UnprocessedChangesServiceTest extends AbstractRedactionIntegrationT
     assertEquals(unprocessedManualEntities.get(0).getTextAfter(), " without PPE (34%");
     assertEquals(unprocessedManualEntities.get(0).getTextBefore(), "to EFSA guidance ");
     assertEquals(unprocessedManualEntities.get(0).getSection(), "[1, 1]: Paragraph: A9396G containing 960 g/L");
-    assertEquals(unprocessedManualEntities.get(0).getPositions().get(0).x(), positions.get(0).getTopLeftX());
-    assertEquals(unprocessedManualEntities.get(0).getPositions().get(0).y(), positions.get(0).getTopLeftY());
-    assertEquals(unprocessedManualEntities.get(0).getPositions().get(0).w(), positions.get(0).getWidth());
-    assertEquals(unprocessedManualEntities.get(0).getPositions().get(0).h(), positions.get(0).getHeight());
+    assertEquals(unprocessedManualEntities.get(0).getPositions()
+            .get(0).x(), positions.get(0).getTopLeftX());
+    assertEquals(unprocessedManualEntities.get(0).getPositions()
+            .get(0).y(), positions.get(0).getTopLeftY());
+    assertEquals(unprocessedManualEntities.get(0).getPositions()
+            .get(0).w(), positions.get(0).getWidth());
+    assertEquals(unprocessedManualEntities.get(0).getPositions()
+            .get(0).h(), positions.get(0).getHeight());
 }
@@ -205,13 +234,37 @@ public class UnprocessedChangesServiceTest extends AbstractRedactionIntegrationT
     ManualRedactions manualRedactions = new ManualRedactions();
     var aoelId = UUID.randomUUID().toString();
-    ManualRedactionEntry manualRedactionEntry = prepareManualRedactionEntry(aoelId, List.of(Rectangle.builder().topLeftX(384.85536f).topLeftY(240.8695f).width(13.49088f).height(10.048125f).page(1).build()), "EL");
+    ManualRedactionEntry manualRedactionEntry = prepareManualRedactionEntry(aoelId,
+            List.of(Rectangle.builder()
+                    .topLeftX(384.85536f)
+                    .topLeftY(240.8695f)
+                    .width(13.49088f)
+                    .height(10.048125f)
+                    .page(1)
+                    .build()),
+            "EL");
     var cormsId = UUID.randomUUID().toString();
-    ManualRedactionEntry manualRedactionEntry2 = prepareManualRedactionEntry(cormsId, List.of(Rectangle.builder().topLeftX(129.86f).topLeftY(505.7295f).width(35.9904f).height(10.048125f).page(1).build()), "CoRMS");
+    ManualRedactionEntry manualRedactionEntry2 = prepareManualRedactionEntry(cormsId,
+            List.of(Rectangle.builder()
+                    .topLeftX(129.86f)
+                    .topLeftY(505.7295f)
+                    .width(35.9904f)
+                    .height(10.048125f)
+                    .page(1)
+                    .build()),
+            "CoRMS");
     var a9Id = UUID.randomUUID().toString();
-    ManualRedactionEntry manualRedactionEntry3 = prepareManualRedactionEntry(a9Id, List.of(Rectangle.builder().topLeftX(140.1096f).topLeftY(291.5095f).width(37.84512f).height(10.048125f).page(1).build()), "A9396G");
+    ManualRedactionEntry manualRedactionEntry3 = prepareManualRedactionEntry(a9Id,
+            List.of(Rectangle.builder()
+                    .topLeftX(140.1096f)
+                    .topLeftY(291.5095f)
+                    .width(37.84512f)
+                    .height(10.048125f)
+                    .page(1)
+                    .build()),
+            "A9396G");
     manualRedactions.getEntriesToAdd().add(manualRedactionEntry3);
     manualRedactions.getEntriesToAdd().add(manualRedactionEntry2);
@@ -238,35 +291,53 @@ public class UnprocessedChangesServiceTest extends AbstractRedactionIntegrationT
     assertFalse(unprocessedManualEntities.isEmpty());
     assertEquals(unprocessedManualEntities.size(), 3);
-    var resizedAoel = unprocessedManualEntities.stream().filter(unprocessedManualEntity -> unprocessedManualEntity.getAnnotationId().equals(aoelId)).findFirst();
+    var resizedAoel = unprocessedManualEntities.stream()
+            .filter(unprocessedManualEntity -> unprocessedManualEntity.getAnnotationId().equals(aoelId))
+            .findFirst();
     assertTrue(resizedAoel.isPresent());
     assertEquals(resizedAoel.get().getTextAfter(), " (max. 43% of");
     assertEquals(resizedAoel.get().getTextBefore(), "is below the ");
     assertEquals(resizedAoel.get().getSection(), "[1, 1]: Paragraph: A9396G containing 960 g/L");
-    assertEquals(resizedAoel.get().getPositions().get(0).x(), positions.get(0).getTopLeftX());
-    assertEquals(resizedAoel.get().getPositions().get(0).y(), positions.get(0).getTopLeftY());
-    assertEquals(resizedAoel.get().getPositions().get(0).w(), positions.get(0).getWidth());
-    assertEquals(resizedAoel.get().getPositions().get(0).h(), positions.get(0).getHeight());
+    assertEquals(resizedAoel.get().getPositions()
+            .get(0).x(), positions.get(0).getTopLeftX());
+    assertEquals(resizedAoel.get().getPositions()
+            .get(0).y(), positions.get(0).getTopLeftY());
+    assertEquals(resizedAoel.get().getPositions()
+            .get(0).w(), positions.get(0).getWidth());
+    assertEquals(resizedAoel.get().getPositions()
+            .get(0).h(), positions.get(0).getHeight());
-    var cormsResized = unprocessedManualEntities.stream().filter(unprocessedManualEntity -> unprocessedManualEntity.getAnnotationId().equals(cormsId)).findFirst();
+    var cormsResized = unprocessedManualEntities.stream()
+            .filter(unprocessedManualEntity -> unprocessedManualEntity.getAnnotationId().equals(cormsId))
+            .findFirst();
     assertTrue(cormsResized.isPresent());
     assertEquals(cormsResized.get().getTextAfter(), " a NOAEL of");
     assertEquals(cormsResized.get().getTextBefore(), "mg/kg bw/d. Furthermore ");
     assertEquals(cormsResized.get().getSection(), "[0, 3]: Paragraph: The Co-RMS indicated the");
-    assertEquals(cormsResized.get().getPositions().get(0).x(), positions2.get(0).getTopLeftX());
-    assertEquals(cormsResized.get().getPositions().get(0).y(), positions2.get(0).getTopLeftY());
-    assertEquals(cormsResized.get().getPositions().get(0).w(), positions2.get(0).getWidth());
-    assertEquals(cormsResized.get().getPositions().get(0).h(), positions2.get(0).getHeight());
+    assertEquals(cormsResized.get().getPositions()
+            .get(0).x(), positions2.get(0).getTopLeftX());
+    assertEquals(cormsResized.get().getPositions()
+            .get(0).y(), positions2.get(0).getTopLeftY());
+    assertEquals(cormsResized.get().getPositions()
+            .get(0).w(), positions2.get(0).getWidth());
+    assertEquals(cormsResized.get().getPositions()
+            .get(0).h(), positions2.get(0).getHeight());
-    var a9Resized = unprocessedManualEntities.stream().filter(unprocessedManualEntity -> unprocessedManualEntity.getAnnotationId().equals(a9Id)).findFirst();
+    var a9Resized = unprocessedManualEntities.stream()
+            .filter(unprocessedManualEntity -> unprocessedManualEntity.getAnnotationId().equals(a9Id))
+            .findFirst();
     assertTrue(a9Resized.isPresent());
     assertEquals(a9Resized.get().getTextAfter(), " were obtained from");
     assertEquals(a9Resized.get().getTextBefore(), "data for S");
     assertEquals(a9Resized.get().getSection(), "[1, 1]: Paragraph: A9396G containing 960 g/L");
-    assertEquals(a9Resized.get().getPositions().get(0).x(), positions3.get(0).getTopLeftX());
-    assertEquals(a9Resized.get().getPositions().get(0).y(), positions3.get(0).getTopLeftY());
-    assertEquals(a9Resized.get().getPositions().get(0).w(), positions3.get(0).getWidth());
-    assertEquals(a9Resized.get().getPositions().get(0).h(), positions3.get(0).getHeight());
+    assertEquals(a9Resized.get().getPositions()
+            .get(0).x(), positions3.get(0).getTopLeftX());
+    assertEquals(a9Resized.get().getPositions()
+            .get(0).y(), positions3.get(0).getTopLeftY());
+    assertEquals(a9Resized.get().getPositions()
+            .get(0).w(), positions3.get(0).getWidth());
+    assertEquals(a9Resized.get().getPositions()
+            .get(0).h(), positions3.get(0).getHeight());
 }
@@ -277,7 +348,15 @@ public class UnprocessedChangesServiceTest extends AbstractRedactionIntegrationT
     ManualRedactions manualRedactions = new ManualRedactions();
     var aoelId = UUID.randomUUID().toString();
-    ManualRedactionEntry manualRedactionEntry = prepareManualRedactionEntry(aoelId, List.of(Rectangle.builder().topLeftX(384.85536f).topLeftY(240.8695f).width(13.49088f).height(10.048125f).page(1).build()), "EL");
+    ManualRedactionEntry manualRedactionEntry = prepareManualRedactionEntry(aoelId,
+            List.of(Rectangle.builder()
+                    .topLeftX(384.85536f)
+                    .topLeftY(240.8695f)
+                    .width(13.49088f)
+                    .height(10.048125f)
+                    .page(1)
+                    .build()),
+            "EL");
     manualRedactions.getEntriesToAdd().add(manualRedactionEntry);

     AnalyzeRequest request = uploadFileToStorage(pdfFile);
@@ -301,10 +380,14 @@ public class UnprocessedChangesServiceTest extends AbstractRedactionIntegrationT
     assertEquals(unprocessedManualEntities.get(0).getTextAfter(), " (max. 43% of");
     assertEquals(unprocessedManualEntities.get(0).getTextBefore(), "is below the ");
     assertEquals(unprocessedManualEntities.get(0).getSection(), "[1, 1]: Paragraph: A9396G containing 960 g/L");
-    assertEquals(unprocessedManualEntities.get(0).getPositions().get(0).x(), positions.get(0).getTopLeftX());
-    assertEquals(unprocessedManualEntities.get(0).getPositions().get(0).y(), positions.get(0).getTopLeftY());
-    assertEquals(unprocessedManualEntities.get(0).getPositions().get(0).w(), positions.get(0).getWidth());
-    assertEquals(unprocessedManualEntities.get(0).getPositions().get(0).h(), positions.get(0).getHeight());
+    assertEquals(unprocessedManualEntities.get(0).getPositions()
+            .get(0).x(), positions.get(0).getTopLeftX());
+    assertEquals(unprocessedManualEntities.get(0).getPositions()
+            .get(0).y(), positions.get(0).getTopLeftY());
+    assertEquals(unprocessedManualEntities.get(0).getPositions()
+            .get(0).w(), positions.get(0).getWidth());
+    assertEquals(unprocessedManualEntities.get(0).getPositions()
+            .get(0).h(), positions.get(0).getHeight());
 }
@@ -315,7 +398,15 @@ public class UnprocessedChangesServiceTest extends AbstractRedactionIntegrationT
     ManualRedactions manualRedactions = new ManualRedactions();
     var aoelId = UUID.randomUUID().toString();
-    ManualRedactionEntry manualRedactionEntry = prepareManualRedactionEntry(aoelId, List.of(Rectangle.builder().topLeftX(384.85536f).topLeftY(240.8695f).width(13.49088f).height(10.048125f).page(1).build()), "EL");
+    ManualRedactionEntry manualRedactionEntry = prepareManualRedactionEntry(aoelId,
+            List.of(Rectangle.builder()
+                    .topLeftX(384.85536f)
+                    .topLeftY(240.8695f)
+                    .width(13.49088f)
+                    .height(10.048125f)
+                    .page(1)
+                    .build()),
+            "EL");
     manualRedactions.getEntriesToAdd().add(manualRedactionEntry);

     AnalyzeRequest request = uploadFileToStorage(pdfFile);
@@ -339,10 +430,14 @@ public class UnprocessedChangesServiceTest extends AbstractRedactionIntegrationT
     assertEquals(unprocessedManualEntities.get(0).getTextAfter(), ", the same");
     assertEquals(unprocessedManualEntities.get(0).getTextBefore(), "to set an ");
     assertEquals(unprocessedManualEntities.get(0).getSection(), "[0, 4]: Paragraph: With respect to the");
-    assertEquals(unprocessedManualEntities.get(0).getPositions().get(0).x(), positions.get(0).getTopLeftX());
-    assertEquals(unprocessedManualEntities.get(0).getPositions().get(0).y(), positions.get(0).getTopLeftY());
-    assertEquals(unprocessedManualEntities.get(0).getPositions().get(0).w(), positions.get(0).getWidth());
-    assertEquals(unprocessedManualEntities.get(0).getPositions().get(0).h(), positions.get(0).getHeight());
+    assertEquals(unprocessedManualEntities.get(0).getPositions()
+            .get(0).x(), positions.get(0).getTopLeftX());
+    assertEquals(unprocessedManualEntities.get(0).getPositions()
+            .get(0).y(), positions.get(0).getTopLeftY());
+    assertEquals(unprocessedManualEntities.get(0).getPositions()
+            .get(0).w(), positions.get(0).getWidth());
+    assertEquals(unprocessedManualEntities.get(0).getPositions()
+            .get(0).h(), positions.get(0).getHeight());
 }
@@ -353,7 +448,15 @@ public class UnprocessedChangesServiceTest extends AbstractRedactionIntegrationT
     ManualRedactions manualRedactions = new ManualRedactions();
     var aoelId = UUID.randomUUID().toString();
-    ManualRedactionEntry manualRedactionEntry = prepareManualRedactionEntry(aoelId, List.of(Rectangle.builder().topLeftX(384.85536f).topLeftY(240.8695f).width(13.49088f).height(10.048125f).page(1).build()), "EL");
+    ManualRedactionEntry manualRedactionEntry = prepareManualRedactionEntry(aoelId,
+            List.of(Rectangle.builder()
+                    .topLeftX(384.85536f)
+                    .topLeftY(240.8695f)
+                    .width(13.49088f)
+                    .height(10.048125f)
+                    .page(1)
+                    .build()),
+            "EL");
     manualRedactions.getEntriesToAdd().add(manualRedactionEntry);

     AnalyzeRequest request = uploadFileToStorage(pdfFile);
@@ -377,25 +480,32 @@ public class UnprocessedChangesServiceTest extends AbstractRedactionIntegrationT
 private static ManualResizeRedaction prepareManualSizeRedaction(String id, List<Rectangle> positions, String value) {
-    ManualResizeRedaction manualResizeRedaction = new ManualResizeRedaction();
-    manualResizeRedaction.setAnnotationId(id);
-    manualResizeRedaction.setPositions(positions);
-    manualResizeRedaction.setUpdateDictionary(false);
-    manualResizeRedaction.setAddToAllDossiers(false);
-    manualResizeRedaction.setValue(value);
-    return manualResizeRedaction;
+    return ManualResizeRedaction.builder()
+            .annotationId(id)
+            .fileId("fileId")
+            .user("user")
+            .positions(positions)
+            .updateDictionary(false)
+            .addToAllDossiers(false)
+            .value(value)
+            .requestDate(OffsetDateTime.now())
+            .build();
 }

 private static ManualRedactionEntry prepareManualRedactionEntry(String id, List<Rectangle> positions, String value) {
-    ManualRedactionEntry manualRedactionEntry = new ManualRedactionEntry();
-    manualRedactionEntry.setAnnotationId(id);
-    manualRedactionEntry.setFileId("fileId");
-    manualRedactionEntry.setType("CBI_author");
-    manualRedactionEntry.setValue(value);
-    manualRedactionEntry.setReason("Manual Redaction");
-    manualRedactionEntry.setPositions(positions);
-    return manualRedactionEntry;
+    return ManualRedactionEntry.builder()
+            .annotationId(id)
+            .fileId("fileId")
+            .user("user")
+            .type("CBI_author")
+            .value(value)
+            .reason("Manual Redaction")
+            .processedDate(OffsetDateTime.now())
+            .requestDate(OffsetDateTime.now())
+            .positions(positions)
+            .build();
 }
 }
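The hunk above switches both test helpers from setter-style construction to Lombok-generated builders, so the object is assembled in one fluent chain and returned immutable. A hand-rolled miniature of the same pattern; `Entry` is an illustrative stand-in, not the real `ManualRedactionEntry`:

```java
// Minimal builder pattern as generated by Lombok's @Builder:
// a mutable Builder collects fields, build() freezes them into
// an immutable Entry.
public class BuilderDemo {

    static final class Entry {

        private final String annotationId;
        private final String value;

        private Entry(Builder b) {
            this.annotationId = b.annotationId;
            this.value = b.value;
        }

        static Builder builder() {
            return new Builder();
        }

        String getAnnotationId() { return annotationId; }

        String getValue() { return value; }

        static final class Builder {

            private String annotationId;
            private String value;

            Builder annotationId(String id) { this.annotationId = id; return this; }

            Builder value(String v) { this.value = v; return this; }

            // Freeze the collected fields into an immutable Entry
            Entry build() { return new Entry(this); }
        }
    }

    public static void main(String[] args) {
        Entry e = Entry.builder().annotationId("123").value("AOEL").build();
        System.out.println(e.getAnnotationId() + ":" + e.getValue());
    }
}
```

The fluent chain reads the same as the refactored helpers; Lombok simply generates the nested `Builder` class for you.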


@@ -0,0 +1,61 @@
+package com.iqser.red.service.redaction.v1.server.utils;
+
+import java.awt.Color;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import org.apache.pdfbox.cos.COSName;
+
+import com.iqser.red.service.redaction.v1.server.model.document.entity.PositionOnPage;
+import com.iqser.red.service.redaction.v1.server.model.document.entity.TextEntity;
+import com.iqser.red.service.redaction.v1.server.model.document.nodes.Page;
+import com.knecon.fforesight.service.viewerdoc.ContentStreams;
+import com.knecon.fforesight.service.viewerdoc.model.ColoredRectangle;
+import com.knecon.fforesight.service.viewerdoc.model.VisualizationsOnPage;
+
+import lombok.experimental.UtilityClass;
+
+@UtilityClass
+public class EntityVisualizationUtility {
+
+    public static final ContentStreams.Identifier ENTITY_LAYER = new ContentStreams.Identifier("Entities", COSName.getPDFName("KNECON_ENTITIES"), true);
+
+    public Map<Integer, VisualizationsOnPage> createVisualizationsOnPage(Collection<TextEntity> entity, Color color) {
+
+        Map<Integer, VisualizationsOnPage> visualizations = new HashMap<>();
+        Set<Page> pages = entity.stream()
+                .map(TextEntity::getPages)
+                .flatMap(Collection::stream)
+                .collect(Collectors.toSet());
+        pages.forEach(page -> visualizations.put(page.getNumber() - 1, buildVisualizationsOnPage(color, page)));
+        return visualizations;
+    }
+
+    private static VisualizationsOnPage buildVisualizationsOnPage(Color color, Page page) {
+        return VisualizationsOnPage.builder().coloredRectangles(getEntityRectangles(color, page)).build();
+    }
+
+    private static List<ColoredRectangle> getEntityRectangles(Color color, Page page) {
+        return page.getEntities()
+                .stream()
+                .map(TextEntity::getPositionsOnPagePerPage)
+                .flatMap(Collection::stream)
+                .filter(p -> p.getPage().equals(page))
+                .map(PositionOnPage::getRectanglePerLine)
+                .flatMap(Collection::stream)
+                .map(r -> new ColoredRectangle(r, color, 1))
+                .toList();
+    }
+}
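One detail worth noting in the new `EntityVisualizationUtility`: `createVisualizationsOnPage` keys its result map by `page.getNumber() - 1`, i.e. the document model numbers pages from 1 while the visualization map is zero-based. A reduced sketch of that grouping; the `Page` record and string values below are illustrative stand-ins for the project's `Page` and `VisualizationsOnPage` types:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PageIndexDemo {

    record Page(int number) {}

    // Group per page, converting the 1-based page number to a
    // zero-based map key (mirrors page.getNumber() - 1 in the utility).
    static Map<Integer, String> keyByZeroBasedIndex(List<Page> pages) {
        Map<Integer, String> visualizations = new HashMap<>();
        pages.forEach(p -> visualizations.put(p.number() - 1, "vis-for-page-" + p.number()));
        return visualizations;
    }

    public static void main(String[] args) {
        // Pages 1 and 3 land under keys 0 and 2
        System.out.println(keyByZeroBasedIndex(List.of(new Page(1), new Page(3))));
    }
}
```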


@ -16,7 +16,6 @@ public class LayoutParsingRequestProvider {
var originFileStorageId = RedactionStorageService.StorageIdUtils.getStorageId(request.getDossierId(), request.getFileId(), FileType.ORIGIN); var originFileStorageId = RedactionStorageService.StorageIdUtils.getStorageId(request.getDossierId(), request.getFileId(), FileType.ORIGIN);
var tablesFileStorageId = RedactionStorageService.StorageIdUtils.getStorageId(request.getDossierId(), request.getFileId(), FileType.TABLES); var tablesFileStorageId = RedactionStorageService.StorageIdUtils.getStorageId(request.getDossierId(), request.getFileId(), FileType.TABLES);
var imagesFileStorageId = RedactionStorageService.StorageIdUtils.getStorageId(request.getDossierId(), request.getFileId(), FileType.IMAGE_INFO); var imagesFileStorageId = RedactionStorageService.StorageIdUtils.getStorageId(request.getDossierId(), request.getFileId(), FileType.IMAGE_INFO);
var sectionGridStorageId = RedactionStorageService.StorageIdUtils.getStorageId(request.getDossierId(), request.getFileId(), FileType.SECTION_GRID);
var structureFileStorageId = RedactionStorageService.StorageIdUtils.getStorageId(request.getDossierId(), request.getFileId(), FileType.DOCUMENT_STRUCTURE); var structureFileStorageId = RedactionStorageService.StorageIdUtils.getStorageId(request.getDossierId(), request.getFileId(), FileType.DOCUMENT_STRUCTURE);
var textBlockFileStorageId = RedactionStorageService.StorageIdUtils.getStorageId(request.getDossierId(), request.getFileId(), FileType.DOCUMENT_TEXT); var textBlockFileStorageId = RedactionStorageService.StorageIdUtils.getStorageId(request.getDossierId(), request.getFileId(), FileType.DOCUMENT_TEXT);
var positionBlockFileStorageId = RedactionStorageService.StorageIdUtils.getStorageId(request.getDossierId(), request.getFileId(), FileType.DOCUMENT_POSITION); var positionBlockFileStorageId = RedactionStorageService.StorageIdUtils.getStorageId(request.getDossierId(), request.getFileId(), FileType.DOCUMENT_POSITION);
@ -33,7 +32,8 @@ public class LayoutParsingRequestProvider {
.textBlockFileStorageId(textBlockFileStorageId) .textBlockFileStorageId(textBlockFileStorageId)
.positionBlockFileStorageId(positionBlockFileStorageId) .positionBlockFileStorageId(positionBlockFileStorageId)
.pageFileStorageId(pageFileStorageId) .pageFileStorageId(pageFileStorageId)
.simplifiedTextStorageId(simplifiedTextStorageId).viewerDocumentStorageId(viewerDocumentStorageId) .simplifiedTextStorageId(simplifiedTextStorageId)
.viewerDocumentStorageId(viewerDocumentStorageId)
.build(); .build();
} }


@@ -25,6 +25,8 @@ redaction-service:
 application:
 type: "RedactManager"
+logging.type: "CONSOLE"
 storage:
 backend: 's3'


@@ -153,6 +153,32 @@ rule "CBI.7.1: Do not redact Names and Addresses if published information found
 $authorOrAddress.skipWithReferences("CBI.7.1", "Published Information found in row", $table.getEntitiesOfTypeInSameRow("published_information", $authorOrAddress));
 end
+rule "CBI.7.2: Do not redact PII if published information found in Section without tables"
+when
+$section: Section(!hasTables(),
+hasEntitiesOfType("published_information"),
+hasEntitiesOfType("PII"))
+then
+$section.getEntitiesOfType("PII")
+.forEach(redactionEntity -> {
+redactionEntity.skipWithReferences(
+"CBI.7.2",
+"Published Information found in section",
+$section.getEntitiesOfType("published_information")
+);
+});
+end
+rule "CBI.7.3: Do not redact PII if published information found in same table row"
+when
+$table: Table(hasEntitiesOfType("published_information"), hasEntitiesOfType("PII"))
+$cellsWithPublishedInformation: TableCell() from $table.streamTableCellsWhichContainType("published_information").toList()
+$tableCell: TableCell(row == $cellsWithPublishedInformation.row) from $table.streamTableCells().toList()
+$pii: TextEntity(type() == "PII", active()) from $tableCell.getEntities()
+then
+$pii.skipWithReferences("CBI.7.3", "Published Information found in row", $table.getEntitiesOfTypeInSameRow("published_information", $pii));
+end
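The row-matching logic of rule CBI.7.3 (skip PII entities that share a table row with a `published_information` entity) can be sketched outside the rule engine. The `Entity`/`Cell` records below are hypothetical simplifications of the engine's `TextEntity`/`TableCell` facts:

```java
import java.util.*;
import java.util.stream.*;

public class RowSkipSketch {
    // Hypothetical stand-ins for the rule engine's facts.
    record Entity(String type) {}
    record Cell(int row, List<Entity> entities) {}

    // Return the PII entities that share a row with a published_information
    // entity, i.e. the ones a rule like CBI.7.3 would skip rather than redact.
    static List<Entity> piiToSkip(List<Cell> cells) {
        Set<Integer> publishedRows = cells.stream()
                .filter(c -> c.entities().stream().anyMatch(e -> e.type().equals("published_information")))
                .map(Cell::row)
                .collect(Collectors.toSet());
        return cells.stream()
                .filter(c -> publishedRows.contains(c.row()))
                .flatMap(c -> c.entities().stream())
                .filter(e -> e.type().equals("PII"))
                .toList();
    }
}
```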
 // Rule unit: CBI.9
 rule "CBI.9.0: Redact all cells with Header Author(s) as CBI_author (non vertebrate study)"
@@ -181,6 +207,19 @@ rule "CBI.9.1: Redact all cells with Header Author as CBI_author (non vertebrate
 .forEach(redactionEntity -> redactionEntity.redact("CBI.9.1", "Author found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
 end
+rule "CBI.9.2: Redact all cells with Header Author(s) as CBI_author (non vertebrate study)"
+agenda-group "LOCAL_DICTIONARY_ADDS"
+when
+not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
+$table: Table(hasHeader("Author(s)"))
+then
+$table.streamTableCellsWithHeader("Author(s)")
+.map(tableCell -> entityCreationService.bySemanticNode(tableCell, "CBI_author", EntityType.ENTITY))
+.filter(Optional::isPresent)
+.map(Optional::get)
+.forEach(redactionEntity -> redactionEntity.redact("CBI.9.2", "Author(s) found", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)"));
+end
 // Rule unit: CBI.10
 rule "CBI.10.0: Redact all cells with Header Author(s) as CBI_author (vertebrate study)"
@@ -209,6 +248,32 @@ rule "CBI.10.1: Redact all cells with Header Author as CBI_author (vertebrate st
 .forEach(redactionEntity -> redactionEntity.redact("CBI.10.1", "Author found", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
 end
+rule "CBI.10.2: Redact all cells with Header Author(s) as CBI_author (vertebrate study)"
+agenda-group "LOCAL_DICTIONARY_ADDS"
+when
+FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
+$table: Table(hasHeader("Author(s)"))
+then
+$table.streamTableCellsWithHeader("Author(s)")
+.map(tableCell -> entityCreationService.bySemanticNode(tableCell, "CBI_author", EntityType.ENTITY))
+.filter(Optional::isPresent)
+.map(Optional::get)
+.forEach(redactionEntity -> redactionEntity.redact("CBI.10.2", "Author(s) found", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)"));
+end
+rule "CBI.10.3: Redact all cells with Header Author as CBI_author (vertebrate study)"
+agenda-group "LOCAL_DICTIONARY_ADDS"
+when
+FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
+$table: Table(hasHeader("Author"))
+then
+$table.streamTableCellsWithHeader("Author")
+.map(tableCell -> entityCreationService.bySemanticNode(tableCell, "CBI_author", EntityType.ENTITY))
+.filter(Optional::isPresent)
+.map(Optional::get)
+.forEach(redactionEntity -> redactionEntity.redact("CBI.10.3", "Author found", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)"));
+end
 // Rule unit: CBI.11
 rule "CBI.11.0: Recommend all CBI_author entities in Table with Vertebrate Study Y/N Header"
@@ -222,7 +287,19 @@ rule "CBI.11.0: Recommend all CBI_author entities in Table with Vertebrate Study
 // Rule unit: CBI.16
-rule "CBI.16.0: Add CBI_author with \"et al.\" RegEx (non vertebrate study)"
+rule "CBI.16.0: Add CBI_author with \"et al.\" RegEx"
+agenda-group "LOCAL_DICTIONARY_ADDS"
+when
+$section: Section(containsString("et al."))
+then
+entityCreationService.byRegex("\\b([A-ZÄÖÜ][^\\s\\.,]+( [A-ZÄÖÜ]{1,2}\\.?)?( ?[A-ZÄÖÜ]\\.?)?) et al\\.?", "CBI_author", EntityType.ENTITY, 1, $section)
+.forEach(entity -> {
+entity.redact("CBI.16.0", "Author found by \"et al\" regex", "Reg (EC) No 1107/2009 Art. 63 (2g)");
+dictionary.recommendEverywhere(entity);
+});
+end
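All CBI.16 rules share the same author regex and pass capture group 1 to the created `CBI_author` entity. A quick standalone check of what that group yields (the `author` helper is illustrative, not part of the codebase):

```java
import java.util.regex.*;

public class EtAlRegex {
    // Same pattern the CBI.16 rules hand to entityCreationService.byRegex;
    // group 1 is the author name, optionally followed by initials.
    static final Pattern ET_AL = Pattern.compile(
            "\\b([A-ZÄÖÜ][^\\s\\.,]+( [A-ZÄÖÜ]{1,2}\\.?)?( ?[A-ZÄÖÜ]\\.?)?) et al\\.?");

    static String author(String text) {
        Matcher m = ET_AL.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Surname plus initial: group 1 captures "Smith J."
        System.out.println(author("as reported by Smith J. et al. (2019)"));
    }
}
```

The leading `[A-ZÄÖÜ]` restricts matches to capitalized (including German umlaut) surnames, so lowercase words directly before "et al." are not captured.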
+rule "CBI.16.1: Add CBI_author with \"et al.\" RegEx (non vertebrate study)"
 agenda-group "LOCAL_DICTIONARY_ADDS"
 when
 not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
@@ -230,12 +307,12 @@ rule "CBI.16.0: Add CBI_author with \"et al.\" RegEx (non vertebrate study)"
 then
 entityCreationService.byRegex("\\b([A-ZÄÖÜ][^\\s\\.,]+( [A-ZÄÖÜ]{1,2}\\.?)?( ?[A-ZÄÖÜ]\\.?)?) et al\\.?", "CBI_author", EntityType.ENTITY, 1, $section)
 .forEach(entity -> {
-entity.redact("CBI.16.0", "Author found by \"et al\" regex", "Article 39(e)(3) of Regulation (EC) No 178/2002");
+entity.redact("CBI.16.1", "Author found by \"et al\" regex", "Article 39(e)(3) of Regulation (EC) No 178/2002");
 dictionary.recommendEverywhere(entity);
 });
 end
-rule "CBI.16.1: Add CBI_author with \"et al.\" RegEx (vertebrate study)"
+rule "CBI.16.2: Add CBI_author with \"et al.\" RegEx (vertebrate study)"
 agenda-group "LOCAL_DICTIONARY_ADDS"
 when
 FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
@@ -243,7 +320,19 @@ rule "CBI.16.1: Add CBI_author with \"et al.\" RegEx (vertebrate study)"
 then
 entityCreationService.byRegex("\\b([A-ZÄÖÜ][^\\s\\.,]+( [A-ZÄÖÜ]{1,2}\\.?)?( ?[A-ZÄÖÜ]\\.?)?) et al\\.?", "CBI_author", EntityType.ENTITY, 1, $section)
 .forEach(entity -> {
-entity.redact("CBI.16.1", "Author found by \"et al\" regex", "Article 39(e)(2) of Regulation (EC) No 178/2002");
+entity.redact("CBI.16.2", "Author found by \"et al\" regex", "Article 39(e)(2) of Regulation (EC) No 178/2002");
+dictionary.recommendEverywhere(entity);
+});
+end
+rule "CBI.16.3: Add CBI_author with \"et al.\" RegEx"
+agenda-group "LOCAL_DICTIONARY_ADDS"
+when
+$section: Section(containsString("et al."))
+then
+entityCreationService.byRegex("\\b([A-ZÄÖÜ][^\\s\\.,]+( [A-ZÄÖÜ]{1,2}\\.?)?( ?[A-ZÄÖÜ]\\.?)?) et al\\.?", "CBI_author", EntityType.ENTITY, 1, $section)
+.forEach(entity -> {
+entity.redact("CBI.16.3", "Author found by \"et al\" regex", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)");
 dictionary.recommendEverywhere(entity);
 });
 end
@@ -268,7 +357,19 @@ rule "CBI.17.1: Add recommendation for Addresses in Test Organism sections, with
 // Rule unit: CBI.20
-rule "CBI.20.0: Redact between \"PERFORMING LABORATORY\" and \"LABORATORY PROJECT ID:\" (non vertebrate study)"
+rule "CBI.20.0: Redact between \"PERFORMING LABORATORY\" and \"LABORATORY PROJECT ID:\")"
+agenda-group "LOCAL_DICTIONARY_ADDS"
+when
+$section: Section(!hasTables(), containsString("PERFORMING LABORATORY:"), containsString("LABORATORY PROJECT ID:"))
+then
+entityCreationService.betweenStrings("PERFORMING LABORATORY:", "LABORATORY PROJECT ID:", "CBI_address", EntityType.ENTITY, $section)
+.forEach(laboratoryEntity -> {
+laboratoryEntity.redact("CBI.20.0", "PERFORMING LABORATORY was found", "Reg (EC) No 1107/2009 Art. 63 (2g)");
+dictionary.recommendEverywhere(laboratoryEntity);
+});
+end
+rule "CBI.20.1: Redact between \"PERFORMING LABORATORY\" and \"LABORATORY PROJECT ID:\" (non vertebrate study)"
 agenda-group "LOCAL_DICTIONARY_ADDS"
 when
 not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
@@ -276,12 +377,12 @@ rule "CBI.20.0: Redact between \"PERFORMING LABORATORY\" and \"LABORATORY PROJEC
 then
 entityCreationService.betweenStrings("PERFORMING LABORATORY:", "LABORATORY PROJECT ID:", "CBI_address", EntityType.ENTITY, $section)
 .forEach(laboratoryEntity -> {
-laboratoryEntity.skip("CBI.20.0", "PERFORMING LABORATORY was found for non vertebrate study");
+laboratoryEntity.skip("CBI.20.1", "PERFORMING LABORATORY was found for non vertebrate study");
 dictionary.recommendEverywhere(laboratoryEntity);
 });
 end
-rule "CBI.20.1: Redact between \"PERFORMING LABORATORY\" and \"LABORATORY PROJECT ID:\" (vertebrate study)"
+rule "CBI.20.2: Redact between \"PERFORMING LABORATORY\" and \"LABORATORY PROJECT ID:\" (vertebrate study)"
 agenda-group "LOCAL_DICTIONARY_ADDS"
 when
 FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
@@ -289,54 +390,133 @@ rule "CBI.20.1: Redact between \"PERFORMING LABORATORY\" and \"LABORATORY PROJEC
 then
 entityCreationService.betweenStrings("PERFORMING LABORATORY:", "LABORATORY PROJECT ID:", "CBI_address", EntityType.ENTITY, $section)
 .forEach(laboratoryEntity -> {
-laboratoryEntity.redact("CBI.20.1", "PERFORMING LABORATORY was found", "Article 39(e)(2) of Regulation (EC) No 178/2002");
+laboratoryEntity.redact("CBI.20.2", "PERFORMING LABORATORY was found", "Article 39(e)(2) of Regulation (EC) No 178/2002");
 dictionary.recommendEverywhere(laboratoryEntity);
 });
 end
+rule "CBI.20.3: Redact between \"PERFORMING LABORATORY\" and \"LABORATORY PROJECT ID:\""
+agenda-group "LOCAL_DICTIONARY_ADDS"
+when
+$section: Section(!hasTables(), containsString("PERFORMING LABORATORY:"), containsString("LABORATORY PROJECT ID:"))
+then
+entityCreationService.betweenStrings("PERFORMING LABORATORY:", "LABORATORY PROJECT ID:", "CBI_address", EntityType.ENTITY, $section)
+.forEach(laboratoryEntity -> {
+laboratoryEntity.redact("CBI.20.3", "PERFORMING LABORATORY was found", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)");
+dictionary.recommendEverywhere(laboratoryEntity);
+});
+end
+// Rule unit: CBI.23
+rule "CBI.23.0: Redact between \"AUTHOR(S)\" and \"(STUDY) COMPLETION DATE\" (non vertebrate study)"
+when
+not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
+$document: Document(containsStringIgnoreCase("AUTHOR(S)"), containsAnyStringIgnoreCase("COMPLETION DATE", "STUDY COMPLETION DATE"))
+then
+entityCreationService.shortestBetweenAnyStringIgnoreCase(List.of("AUTHOR(S)", "AUTHOR(S):"), List.of("COMPLETION DATE", "COMPLETION DATE:", "STUDY COMPLETION DATE", "STUDY COMPLETION DATE:"), "CBI_author", EntityType.ENTITY, $document)
+.forEach(authorEntity -> authorEntity.redact("CBI.23.0", "AUTHOR(S) was found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
+end
+rule "CBI.23.1: Redact between \"AUTHOR(S)\" and \"(STUDY) COMPLETION DATE\" (vertebrate study)"
+when
+FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
+$document: Document(containsStringIgnoreCase("AUTHOR(S)"), containsAnyStringIgnoreCase("COMPLETION DATE", "STUDY COMPLETION DATE"))
+then
+entityCreationService.shortestBetweenAnyStringIgnoreCase(List.of("AUTHOR(S)", "AUTHOR(S):"), List.of("COMPLETION DATE", "COMPLETION DATE:", "STUDY COMPLETION DATE", "STUDY COMPLETION DATE:"), "CBI_author", EntityType.ENTITY, $document)
+.forEach(authorEntity -> authorEntity.redact("CBI.23.1", "AUTHOR(S) was found", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
+end
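The CBI.23 rules rely on `shortestBetweenAnyStringIgnoreCase`, whose implementation is not shown in this diff. A plausible sketch of its semantics, assuming it picks the shortest text enclosed between any start marker and any subsequent end marker (case-insensitively):

```java
import java.util.*;

public class ShortestBetween {
    // Assumed semantics: try every (start, end) marker pair and return the
    // shortest span between the end of a start marker and the next end marker.
    static Optional<String> shortestBetween(List<String> starts, List<String> ends, String text) {
        String lower = text.toLowerCase();
        String best = null;
        for (String s : starts) {
            int i = lower.indexOf(s.toLowerCase());
            if (i < 0) continue;
            int from = i + s.length();
            for (String e : ends) {
                int j = lower.indexOf(e.toLowerCase(), from);
                if (j < 0) continue;
                String span = text.substring(from, j);
                if (best == null || span.length() < best.length()) best = span;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

Offering both `"AUTHOR(S)"` and `"AUTHOR(S):"` as start markers lets the shorter-span rule prefer the variant with the colon, so the colon itself is excluded from the redacted entity.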
 //------------------------------------ PII rules ------------------------------------
 // Rule unit: PII.0
-rule "PII.0.0: Redact all PII (non vertebrate study)"
+rule "PII.0.0: Redact all PII"
+when
+$pii: TextEntity(type() == "PII", dictionaryEntry)
+then
+$pii.redact("PII.0.0", "Personal Information found", "Reg (EC) No 1107/2009 Art. 63 (2g)");
+end
+rule "PII.0.1: Redact all PII (non vertebrate study)"
 when
 not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
 $pii: TextEntity(type() == "PII", dictionaryEntry)
 then
-$pii.redact("PII.0.0", "Personal Information found", "Article 39(e)(3) of Regulation (EC) No 178/2002");
+$pii.redact("PII.0.1", "Personal Information found", "Article 39(e)(3) of Regulation (EC) No 178/2002");
 end
-rule "PII.0.1: Redact all PII (vertebrate study)"
+rule "PII.0.2: Redact all PII (vertebrate study)"
 when
 FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
 $pii: TextEntity(type() == "PII", dictionaryEntry)
 then
-$pii.redact("PII.0.1", "Personal Information found", "Article 39(e)(2) of Regulation (EC) No 178/2002");
+$pii.redact("PII.0.2", "Personal Information found", "Article 39(e)(2) of Regulation (EC) No 178/2002");
+end
+rule "PII.0.3: Redact all PII"
+when
+$pii: TextEntity(type() == "PII", dictionaryEntry)
+then
+$pii.redact("PII.0.3", "Personal Information found", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)");
 end
 // Rule unit: PII.1
-rule "PII.1.0: Redact Emails by RegEx (Non vertebrate study)"
+rule "PII.1.0: Redact Emails by RegEx"
+when
+$section: Section(containsString("@"))
+then
+entityCreationService.byRegex("\\b([A-Za-z0-9._%+\\-]+@[A-Za-z0-9.\\-]+\\.[A-Za-z\\-]{1,23}[A-Za-z])\\b", "PII", EntityType.ENTITY, 1, $section)
+.forEach(emailEntity -> emailEntity.redact("PII.1.0", "Found by Email Regex", "Reg (EC) No 1107/2009 Art. 63 (2e)"));
+end
+rule "PII.1.1: Redact Emails by RegEx (Non vertebrate study)"
 when
 not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
 $section: Section(containsString("@"))
 then
 entityCreationService.byRegex("\\b([A-Za-z0-9._%+\\-]+@[A-Za-z0-9.\\-]+\\.[A-Za-z\\-]{1,23}[A-Za-z])\\b", "PII", EntityType.ENTITY, 1, $section)
-.forEach(emailEntity -> emailEntity.redact("PII.1.0", "Found by Email Regex", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
+.forEach(emailEntity -> emailEntity.redact("PII.1.1", "Found by Email Regex", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
 end
-rule "PII.1.1: Redact Emails by RegEx (vertebrate study)"
+rule "PII.1.2: Redact Emails by RegEx (vertebrate study)"
 when
 FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
 $section: Section(containsString("@"))
 then
 entityCreationService.byRegex("\\b([A-Za-z0-9._%+\\-]+@[A-Za-z0-9.\\-]+\\.[A-Za-z\\-]{1,23}[A-Za-z])\\b", "PII", EntityType.ENTITY, 1, $section)
-.forEach(emailEntity -> emailEntity.redact("PII.1.1", "Found by Email Regex", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
+.forEach(emailEntity -> emailEntity.redact("PII.1.2", "Found by Email Regex", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
+end
+rule "PII.1.5: Redact Emails by RegEx"
+when
+$section: Section(containsString("@"))
+then
+entityCreationService.byRegex("\\b([A-Za-z0-9._%+\\-]+@[A-Za-z0-9.\\-]+\\.[A-Za-z\\-]{1,23}[A-Za-z])\\b", "PII", EntityType.ENTITY, 1, $section)
+.forEach(emailEntity -> emailEntity.redact("PII.1.5", "Found by Email Regex", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)"));
 end
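The PII.1 rules all share one email pattern; only the rule id and legal basis differ. The pattern can be exercised standalone (the `firstEmail` helper is illustrative, not part of the codebase):

```java
import java.util.regex.*;

public class EmailRegex {
    // Same pattern the PII.1 rules pass to entityCreationService.byRegex;
    // group 1 is the matched address. The [A-Za-z\-]{1,23}[A-Za-z] tail
    // effectively requires a top-level domain of at least two letters.
    static final Pattern EMAIL = Pattern.compile(
            "\\b([A-Za-z0-9._%+\\-]+@[A-Za-z0-9.\\-]+\\.[A-Za-z\\-]{1,23}[A-Za-z])\\b");

    static String firstEmail(String text) {
        Matcher m = EMAIL.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstEmail("Contact: john.doe@example.com, Tel: 123"));
    }
}
```

The `Section(containsString("@"))` guard in the rule conditions is a cheap pre-filter so the regex only runs on sections that can possibly contain an address.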
 // Rule unit: PII.2
-rule "PII.2.0: Redact Phone and Fax by RegEx (non vertebrate study)"
+rule "PII.2.0: Redact Phone and Fax by RegEx"
+when
+$section: Section(containsString("Contact") ||
+containsString("Telephone") ||
+containsString("Phone") ||
+containsString("Ph.") ||
+containsString("Fax") ||
+containsString("Tel") ||
+containsString("Ter") ||
+containsString("Mobile") ||
+containsString("Fel") ||
+containsString("Fer"))
+then
+entityCreationService.byRegexIgnoreCase("\\b(contact|telephone|phone|ph\\.|fax|tel|ter|mobile|fel|fer)[a-zA-Z\\s]{0,10}[:.\\s]{0,3}([\\+\\d\\(][\\s\\d\\(\\)\\-\\/\\.]{4,100}\\d)\\b", "PII", EntityType.ENTITY, 2, $section)
+.forEach(contactEntity -> contactEntity.redact("PII.2.0", "Found by Phone and Fax Regex", "Reg (EC) No 1107/2009 Art. 63 (2e)"));
+end
+rule "PII.2.1: Redact Phone and Fax by RegEx (non vertebrate study)"
 when
 not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
 $section: Section(containsString("Contact") ||
@@ -350,11 +530,11 @@ rule "PII.2.0: Redact Phone and Fax by RegEx (non vertebrate study)"
 containsString("Fel") ||
 containsString("Fer"))
 then
-entityCreationService.byRegexIgnoreCase("\\b(contact|telephone|phone|ph\\.|fax|tel|ter|mobile|fel|fer)[a-zA-Z\\s]{0,10}[:.\\s]{0,3}([\\+\\d\\(][\\s\\d\\(\\)\\-\\/\\.]{4,100}\\d)\\b", "PII", EntityType.ENTITY, 2, $section)
+entityCreationService.byRegexIgnoreCase("\\b(contact|telephone|phone|ph\\.|fax|tel|ter[^m]|mobile|fel|fer)[a-zA-Z\\s]{0,10}[:.\\s]{0,3}([\\+\\d\\(][\\s\\d\\(\\)\\-\\/\\.]{4,100}\\d)\\b", "PII", EntityType.ENTITY, 2, $section)
-.forEach(contactEntity -> contactEntity.redact("PII.2.0", "Found by Phone and Fax Regex", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
+.forEach(contactEntity -> contactEntity.redact("PII.2.1", "Found by Phone and Fax Regex", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
 end
-rule "PII.2.1: Redact Phone and Fax by RegEx (vertebrate study)"
+rule "PII.2.2: Redact Phone and Fax by RegEx (vertebrate study)"
 when
 FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
 $section: Section(containsString("Contact") ||
@@ -368,35 +548,42 @@ rule "PII.2.1: Redact Phone and Fax by RegEx (vertebrate study)"
 containsString("Fel") ||
 containsString("Fer"))
 then
-entityCreationService.byRegexIgnoreCase("\\b(contact|telephone|phone|ph\\.|fax|tel|ter|mobile|fel|fer)[a-zA-Z\\s]{0,10}[:.\\s]{0,3}([\\+\\d\\(][\\s\\d\\(\\)\\-\\/\\.]{4,100}\\d)\\b", "PII", EntityType.ENTITY, 2, $section)
+entityCreationService.byRegexIgnoreCase("\\b(contact|telephone|phone|ph\\.|fax|tel|ter[^m]|mobile|fel|fer)[a-zA-Z\\s]{0,10}[:.\\s]{0,3}([\\+\\d\\(][\\s\\d\\(\\)\\-\\/\\.]{4,100}\\d)\\b", "PII", EntityType.ENTITY, 2, $section)
-.forEach(contactEntity -> contactEntity.redact("PII.2.1", "Found by Phone and Fax Regex", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
+.forEach(contactEntity -> contactEntity.redact("PII.2.2", "Found by Phone and Fax Regex", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
 end
 // Rule unit: PII.3
-rule "PII.3.0: Redact telephone numbers by RegEx (Non vertebrate study)"
+rule "PII.3.0: Redact telephone numbers by RegEx"
+when
+$section: Section(matchesRegex("[+]\\d{1,}"))
+then
+entityCreationService.byRegex("((([+]\\d{1,3} (\\d{7,12})\\b)|([+]\\d{1,3}(\\d{3,12})\\b|[+]\\d{1,3}([ -]\\(?\\d{1,6}\\)?){2,4})|[+]\\d{1,3} ?((\\d{2,6}\\)?)([ -]\\d{2,6}){1,4}))(-\\d{1,3})?\\b)", "PII", EntityType.ENTITY, 1, $section)
+.forEach(entity -> entity.redact("PII.3.0", "Telephone number found by regex", "Reg (EC) No 1107/2009 Art. 63 (2e)"));
+end
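The PII.3 telephone pattern is a disjunction of several international-number shapes, all anchored on a leading `+` country code (which is why the rule condition pre-filters with `matchesRegex("[+]\d{1,}")`). A standalone smoke test of the pattern (the `containsPhone` helper is illustrative):

```java
import java.util.regex.*;

public class PhoneRegex {
    // Same pattern the PII.3 rules use; the simplest alternative it accepts
    // is a 1-3 digit country code, a space, and 7-12 subscriber digits.
    static final Pattern PHONE = Pattern.compile(
            "((([+]\\d{1,3} (\\d{7,12})\\b)|([+]\\d{1,3}(\\d{3,12})\\b|[+]\\d{1,3}([ -]\\(?\\d{1,6}\\)?){2,4})|[+]\\d{1,3} ?((\\d{2,6}\\)?)([ -]\\d{2,6}){1,4}))(-\\d{1,3})?\\b)");

    static boolean containsPhone(String text) {
        return PHONE.matcher(text).find();
    }
}
```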
+rule "PII.3.1: Redact telephone numbers by RegEx (Non vertebrate study)"
 when
 not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
 $section: Section(matchesRegex("[+]\\d{1,}"))
 then
 entityCreationService.byRegex("((([+]\\d{1,3} (\\d{7,12})\\b)|([+]\\d{1,3}(\\d{3,12})\\b|[+]\\d{1,3}([ -]\\(?\\d{1,6}\\)?){2,4})|[+]\\d{1,3} ?((\\d{2,6}\\)?)([ -]\\d{2,6}){1,4}))(-\\d{1,3})?\\b)", "PII", EntityType.ENTITY, 1, $section)
-.forEach(entity -> entity.redact("PII.3.0", "Telephone number found by regex", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
+.forEach(entity -> entity.redact("PII.3.1", "Telephone number found by regex", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
 end
-rule "PII.3.1: Redact telephone numbers by RegEx (vertebrate study)"
+rule "PII.3.2: Redact telephone numbers by RegEx (vertebrate study)"
 when
 FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
 $section: Section(matchesRegex("[+]\\d{1,}"))
 then
 entityCreationService.byRegex("((([+]\\d{1,3} (\\d{7,12})\\b)|([+]\\d{1,3}(\\d{3,12})\\b|[+]\\d{1,3}([ -]\\(?\\d{1,6}\\)?){2,4})|[+]\\d{1,3} ?((\\d{2,6}\\)?)([ -]\\d{2,6}){1,4}))(-\\d{1,3})?\\b)", "PII", EntityType.ENTITY, 1, $section)
-.forEach(entity -> entity.redact("PII.3.1", "Telephone number found by regex", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
+.forEach(entity -> entity.redact("PII.3.2", "Telephone number found by regex", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
 end
 // Rule unit: PII.4
-rule "PII.4.0: Redact line after contact information keywords (non vertebrate study)"
+rule "PII.4.0: Redact line after contact information keywords"
 when
-not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
 $contactKeyword: String() from List.of("Contact point:",
 "Contact:",
 "Alternative contact:",
@@ -422,7 +609,62 @@ rule "PII.4.0: Redact line after contact information keywords (non vertebrate st
 .forEach(contactEntity -> contactEntity.redact("PII.4.0", "Found after \"" + $contactKeyword + "\" contact keyword", "Reg (EC) No 1107/2009 Art. 63 (2e)"));
 end
-rule "PII.4.1: Redact line after contact information keywords (vertebrate study)"
+rule "PII.4.1: Redact line after contact information keywords"
+when
+$contactKeyword: String() from List.of("Contact point:",
+"Contact:",
+"Alternative contact:",
+"European contact:",
+"No:",
+"Contact:",
+"Tel.:",
+"Tel:",
+"Telephone number:",
+"Telephone No:",
+"Telephone:",
+"Phone No.",
+"Phone:",
+"Fax number:",
+"Fax:",
+"E-mail:",
+"Email:",
+"e-mail:",
+"E-mail address:")
+$section: Section(containsString($contactKeyword))
+then
+entityCreationService.lineAfterString($contactKeyword, "PII", EntityType.ENTITY, $section)
+.forEach(contactEntity -> contactEntity.redact("PII.4.1", "Found after \"" + $contactKeyword + "\" contact keyword", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)"));
+end
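The PII.4 and PII.5 rules depend on `entityCreationService.lineAfterString`, whose implementation is not part of this diff. A plausible sketch, assuming it captures the remainder of the line following the keyword:

```java
public class LineAfterKeyword {
    // Assumed semantics of lineAfterString: the text on the same line after
    // the keyword, which the PII.4/PII.5 rules then redact as PII.
    static String lineAfter(String keyword, String text) {
        int i = text.indexOf(keyword);
        if (i < 0) return null;
        int start = i + keyword.length();
        int eol = text.indexOf('\n', start);
        String rest = (eol < 0) ? text.substring(start) : text.substring(start, eol);
        return rest.trim();
    }
}
```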
+rule "PII.4.2: Redact line after contact information keywords (Non vertebrate study)"
+when
+not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
+$contactKeyword: String() from List.of("Contact point:",
+"Contact:",
+"Alternative contact:",
+"European contact:",
+"No:",
+"Contact:",
+"Tel.:",
+"Tel:",
+"Telephone number:",
+"Telephone No:",
+"Telephone:",
+"Phone No.",
+"Phone:",
+"Fax number:",
+"Fax:",
+"E-mail:",
+"Email:",
+"e-mail:",
+"E-mail address:")
+$section: Section(containsString($contactKeyword))
+then
+entityCreationService.lineAfterString($contactKeyword, "PII", EntityType.ENTITY, $section)
+.forEach(contactEntity -> contactEntity.redact("PII.4.2", "Found after \"" + $contactKeyword + "\" contact keyword", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
+end
+rule "PII.4.3: Redact line after contact information keywords (Vertebrate study)"
 when
 FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
 $contactKeyword: String() from List.of("Contact point:",
@@ -447,12 +689,24 @@ rule "PII.4.1: Redact line after contact information keywords (vertebrate study)
 $section: Section(containsString($contactKeyword))
 then
 entityCreationService.lineAfterString($contactKeyword, "PII", EntityType.ENTITY, $section)
-.forEach(contactEntity -> contactEntity.redact("PII.4.1", "Found after \"" + $contactKeyword + "\" contact keyword", "Reg (EC) No 1107/2009 Art. 63 (2e)"));
+.forEach(contactEntity -> contactEntity.redact("PII.4.3", "Found after \"" + $contactKeyword + "\" contact keyword", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
 end
 // Rule unit: PII.5
-rule "PII.5.0: Redact line after contact information keywords reduced (non vertebrate study)"
+rule "PII.5.0: Redact line after contact information keywords reduced"
+when
+$contactKeyword: String() from List.of("Contact point:",
+"Contact:",
+"Alternative contact:",
+"European contact:")
+$section: Section(containsString($contactKeyword))
+then
+entityCreationService.lineAfterString($contactKeyword, "PII", EntityType.ENTITY, $section)
+.forEach(contactEntity -> contactEntity.redact("PII.5.0", "Found after \"" + $contactKeyword + "\" contact keyword", "Reg (EC) No 1107/2009 Art. 63 (2e)"));
+end
+rule "PII.5.1: Redact line after contact information keywords reduced (non vertebrate study)"
 when
 not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
 $contactKeyword: String() from List.of("Contact point:",
@@ -462,10 +716,10 @@ rule "PII.5.0: Redact line after contact information keywords reduced (non verte
 $section: Section(containsString($contactKeyword))
 then
 entityCreationService.lineAfterString($contactKeyword, "PII", EntityType.ENTITY, $section)
-.forEach(contactEntity -> contactEntity.redact("PII.5.0", "Found after \"" + $contactKeyword + "\" contact keyword", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
+.forEach(contactEntity -> contactEntity.redact("PII.5.1", "Found after \"" + $contactKeyword + "\" contact keyword", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
 end
-rule "PII.5.1: Redact line after contact information keywords reduced (Vertebrate study)"
+rule "PII.5.2: Redact line after contact information keywords reduced (Vertebrate study)"
 when
 FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
 $contactKeyword: String() from List.of("Contact point:",
@ -475,12 +729,23 @@ rule "PII.5.1: Redact line after contact information keywords reduced (Vertebrat
$section: Section(containsString($contactKeyword)) $section: Section(containsString($contactKeyword))
then then
entityCreationService.lineAfterString($contactKeyword, "PII", EntityType.ENTITY, $section) entityCreationService.lineAfterString($contactKeyword, "PII", EntityType.ENTITY, $section)
.forEach(contactEntity -> contactEntity.redact("PII.5.1", "Found after \"" + $contactKeyword + "\" contact keyword", "Article 39(e)(2) of Regulation (EC) No 178/2002")); .forEach(contactEntity -> contactEntity.redact("PII.5.2", "Found after \"" + $contactKeyword + "\" contact keyword", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end end
// Rule unit: PII.6 // Rule unit: PII.6
rule "PII.6.0: Redact line between contact keywords (non vertebrate study)" rule "PII.6.0: Redact line between contact keywords"
when
$section: Section((containsString("No:") && containsString("Fax")) || (containsString("Contact:") && containsString("Tel")))
then
Stream.concat(
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
)
.forEach(contactEntity -> contactEntity.redact("PII.6.0", "Found between contact keywords", "Reg (EC) No 1107/2009 Art. 63 (2e)"));
end
rule "PII.6.1: Redact line between contact keywords (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section((containsString("No:") && containsString("Fax")) || (containsString("Contact:") && containsString("Tel")))
@@ -489,10 +754,10 @@ rule "PII.6.0: Redact line between contact keywords (non vertebrate study)"
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
)
.forEach(contactEntity -> contactEntity.redact("PII.6.0", "Found between contact keywords", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
.forEach(contactEntity -> contactEntity.redact("PII.6.1", "Found between contact keywords", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
rule "PII.6.1: Redact line between contact keywords (vertebrate study)"
rule "PII.6.2: Redact line between contact keywords (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section((containsString("No:") && containsString("Fax")) || (containsString("Contact:") && containsString("Tel")))
@@ -501,12 +766,41 @@ rule "PII.6.1: Redact line between contact keywords (vertebrate study)"
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
)
.forEach(contactEntity -> contactEntity.redact("PII.6.1", "Found between contact keywords", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
.forEach(contactEntity -> contactEntity.redact("PII.6.2", "Found between contact keywords", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
rule "PII.6.3: Redact line between contact keywords (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section((containsString("No:") && containsString("Fax")) || (containsString("Contact:") && containsString("Tel")))
then
Stream.concat(
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
)
.forEach(contactEntity -> contactEntity.redact("PII.6.3", "Found between contact keywords", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)"));
end
// Rule unit: PII.7
rule "PII.7.0: Redact contact information if applicant is found (non vertebrate study)"
rule "PII.7.0: Redact contact information if applicant is found"
when
$section: Section(getHeadline().containsString("applicant") ||
getHeadline().containsString("Primary contact") ||
getHeadline().containsString("Alternative contact") ||
containsString("Applicant") ||
containsString("Telephone number:"))
then
Stream.concat(entityCreationService.lineAfterStrings(List.of("Contact point:", "Contact:", "Alternative contact:", "European contact:", "No:", "Contact:", "Tel.:", "Tel:", "Telephone number:",
"Telephone No:", "Telephone:", "Phone No.", "Phone:", "Fax number:", "Fax:", "E-mail:", "Email:", "e-mail:", "E-mail address:"), "PII", EntityType.ENTITY, $section),
Stream.concat(
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
))
.forEach(entity -> entity.redact("PII.7.0", "Applicant information was found", "Reg (EC) No 1107/2009 Art. 63 (2e)"));
end
rule "PII.7.1: Redact contact information if applicant is found (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(getHeadline().containsString("applicant") ||
@@ -521,10 +815,10 @@ rule "PII.7.0: Redact contact information if applicant is found (non vertebrate
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
))
.forEach(entity -> entity.redact("PII.7.0", "Applicant information was found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
.forEach(entity -> entity.redact("PII.7.1", "Applicant information was found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
rule "PII.7.1: Redact contact information if applicant is found (vertebrate study)"
rule "PII.7.2: Redact contact information if applicant is found (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(getHeadline().containsString("applicant") ||
@@ -539,14 +833,13 @@ rule "PII.7.1: Redact contact information if applicant is found (vertebrate stud
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
))
.forEach(entity -> entity.redact("PII.7.1", "Applicant information was found", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
.forEach(entity -> entity.redact("PII.7.2", "Applicant information was found", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
// Rule unit: PII.8
rule "PII.8.0: Redact contact information if producer is found (non vertebrate study)"
rule "PII.8.0: Redact contact information if producer is found"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(containsStringIgnoreCase("producer of the plant protection") ||
containsStringIgnoreCase("producer of the active substance") ||
containsStringIgnoreCase("manufacturer of the active substance") ||
@@ -562,7 +855,25 @@ rule "PII.8.0: Redact contact information if producer is found (non vertebrate s
.forEach(entity -> entity.redact("PII.8.0", "Producer was found", "Reg (EC) No 1107/2009 Art. 63 (2e)"));
end
rule "PII.8.1: Redact contact information if producer is found (vertebrate study)"
rule "PII.8.1: Redact contact information if producer is found (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(containsStringIgnoreCase("producer of the plant protection") ||
containsStringIgnoreCase("producer of the active substance") ||
containsStringIgnoreCase("manufacturer of the active substance") ||
containsStringIgnoreCase("manufacturer:") ||
containsStringIgnoreCase("Producer or producers of the active substance"))
then
Stream.concat(entityCreationService.lineAfterStrings(List.of("Contact point:", "Contact:", "Alternative contact:", "European contact:", "No:", "Contact:", "Tel.:", "Tel:", "Telephone number:",
"Telephone No:", "Telephone:", "Phone No.", "Phone:", "Fax number:", "Fax:", "E-mail:", "Email:", "e-mail:", "E-mail address:"), "PII", EntityType.ENTITY, $section),
Stream.concat(
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
))
.forEach(entity -> entity.redact("PII.8.1", "Producer was found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
rule "PII.8.2: Redact contact information if producer is found (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(containsStringIgnoreCase("producer of the plant protection") ||
@@ -577,27 +888,25 @@ rule "PII.8.1: Redact contact information if producer is found (vertebrate study
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
))
.forEach(entity -> entity.redact("PII.8.1", "Producer was found", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
.forEach(entity -> entity.redact("PII.8.2", "Producer was found", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
// Rule unit: PII.9
rule "PII.9.0: Redact between \"AUTHOR(S)\" and \"(STUDY) COMPLETION DATE\" (non vertebrate study)"
rule "PII.9.0: Redact between \"AUTHOR(S)\" and \"(STUDY) COMPLETION DATE\""
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$document: Document(containsStringIgnoreCase("AUTHOR(S)"), containsAnyStringIgnoreCase("COMPLETION DATE", "STUDY COMPLETION DATE"))
then
entityCreationService.shortestBetweenAnyStringIgnoreCase(List.of("AUTHOR(S)", "AUTHOR(S):"), List.of("COMPLETION DATE", "COMPLETION DATE:", "STUDY COMPLETION DATE", "STUDY COMPLETION DATE:"), "PII", EntityType.ENTITY, $document)
.forEach(authorEntity -> authorEntity.redact("PII.9.0", "AUTHOR(S) was found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
.forEach(authorEntity -> authorEntity.redact("PII.9.0", "AUTHOR(S) was found", "Reg (EC) No 1107/2009 Art. 63 (2e)"));
end
rule "PII.9.1: Redact between \"AUTHOR(S)\" and \"(STUDY) COMPLETION DATE\" (vertebrate study)"
rule "PII.9.3: Redact between \"AUTHOR(S)\" and \"(STUDY) COMPLETION DATE\""
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$document: Document(containsStringIgnoreCase("AUTHOR(S)"), containsAnyStringIgnoreCase("COMPLETION DATE", "STUDY COMPLETION DATE"))
then
entityCreationService.shortestBetweenAnyStringIgnoreCase(List.of("AUTHOR(S)", "AUTHOR(S):"), List.of("COMPLETION DATE", "COMPLETION DATE:", "STUDY COMPLETION DATE", "STUDY COMPLETION DATE:"), "PII", EntityType.ENTITY, $document)
.forEach(authorEntity -> authorEntity.redact("PII.9.1", "AUTHOR(S) was found", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
.forEach(authorEntity -> authorEntity.redact("PII.9.3", "AUTHOR(S) was found", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)"));
end
@@ -654,49 +963,87 @@ rule "ETC.0.0: Purity Hint"
// Rule unit: ETC.2
rule "ETC.2.0: Redact signatures (non vertebrate study)"
rule "ETC.2.0: Redact signatures"
when
$signature: Image(imageType == ImageType.SIGNATURE)
then
$signature.redact("ETC.2.0", "Signature Found", "Reg (EC) No 1107/2009 Art. 63 (2g)");
end
rule "ETC.2.1: Redact signatures (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$signature: Image(imageType == ImageType.SIGNATURE)
then
$signature.redact("ETC.2.0", "Signature Found", "Article 39(e)(3) of Regulation (EC) No 178/2002");
$signature.redact("ETC.2.1", "Signature Found", "Article 39(e)(3) of Regulation (EC) No 178/2002");
end
rule "ETC.2.1: Redact signatures (vertebrate study)"
rule "ETC.2.2: Redact signatures (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$signature: Image(imageType == ImageType.SIGNATURE)
then
$signature.redact("ETC.2.1", "Signature Found", "Article 39(e)(2) of Regulation (EC) No 178/2002");
$signature.redact("ETC.2.2", "Signature Found", "Article 39(e)(2) of Regulation (EC) No 178/2002");
end
rule "ETC.2.3: Redact signatures"
when
$signature: Image(imageType == ImageType.SIGNATURE)
then
$signature.redact("ETC.2.3", "Signature Found", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)");
end end
// Rule unit: ETC.3
rule "ETC.3.0: Skip logos (non vertebrate study)"
rule "ETC.3.0: Redact logos"
when
$logo: Image(imageType == ImageType.LOGO)
then
$logo.redact("ETC.3.0", "Logo Found", "Reg (EC) No 1107/2009 Art. 63 (2g)");
end
rule "ETC.3.1: Skip logos (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$logo: Image(imageType == ImageType.LOGO)
then
$logo.skip("ETC.3.0", "Logo Found");
$logo.skip("ETC.3.1", "Logo Found");
end
rule "ETC.3.1: Redact logos (vertebrate study)"
rule "ETC.3.2: Redact logos (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$logo: Image(imageType == ImageType.LOGO)
then
$logo.redact("ETC.3.1", "Logo Found", "Article 39(e)(2) of Regulation (EC) No 178/2002");
$logo.redact("ETC.3.2", "Logo Found", "Article 39(e)(3) of Regulation (EC) No 178/2002");
end
rule "ETC.3.3: Redact logos"
when
$logo: Image(imageType == ImageType.LOGO)
then
$logo.redact("ETC.3.3", "Logo Found", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)");
end end
// Rule unit: ETC.5
rule "ETC.5.0: Ignore dossier_redaction entries if confidentiality is not 'confidential'"
rule "ETC.5.0: Skip dossier_redaction entries if confidentiality is 'confidential'"
when
FileAttribute(label == "Confidentiality", value == "confidential")
$dossierRedaction: TextEntity(type() == "dossier_redaction")
then
$dossierRedaction.skip("ETC.5.0", "Ignore dossier_redaction when confidential");
$dossierRedaction.getIntersectingNodes().forEach(node -> update(node));
end
rule "ETC.5.1: Remove dossier_redaction entries if confidentiality is not 'confidential'"
salience 256
when
not FileAttribute(label == "Confidentiality", value == "confidential")
$dossierRedaction: TextEntity(type() == "dossier_redaction")
then
$dossierRedaction.ignore("ETC.5.0", "Ignore dossier redactions, when not confidential");
$dossierRedaction.remove("ETC.5.1", "Remove dossier_redaction when not confidential");
$dossierRedaction.getIntersectingNodes().forEach(node -> update(node));
retract($dossierRedaction);
end
@@ -846,7 +1193,7 @@ rule "MAN.3.2: Apply image recategorization"
rule "MAN.3.3: Apply recategorization entities by default"
salience 128
when
$entity: IEntity(getManualOverwrite().getRecategorized().orElse(false), !dictionary.isHint(type()))
$entity: TextEntity(getManualOverwrite().getRecategorized().orElse(false), !dictionary.isHint(type()))
then
$entity.apply("MAN.3.3", "Recategorized entities are applied by default.", $entity.legalBasis());
end
@@ -882,36 +1229,20 @@ rule "MAN.4.1: Apply legal basis change"
rule "X.0.0: Remove Entity contained by Entity of same type"
salience 65
when
$larger: TextEntity($type: type(), $entityType: entityType, active() || skipped())
$larger: TextEntity($type: type(), $entityType: entityType, !removed())
$contained: TextEntity(containedBy($larger), type() == $type, entityType == $entityType, this != $larger, !hasManualChanges(), active())
$contained: TextEntity(containedBy($larger), type() == $type, entityType == $entityType, this != $larger, !hasManualChanges())
then
$contained.remove("X.0.0", "remove Entity contained by Entity of same type");
retract($contained);
end
// Rule unit: X.1
rule "X.1.0: Merge intersecting Entities of same type"
salience 64
when
$first: TextEntity($type: type(), $entityType: entityType, !resized(), active())
$second: TextEntity(intersects($first), type() == $type, entityType == $entityType, this != $first, !hasManualChanges(), active())
then
TextEntity mergedEntity = entityCreationService.mergeEntitiesOfSameType(List.of($first, $second), $type, $entityType, document);
$first.remove("X.1.0", "merge intersecting Entities of same type");
$second.remove("X.1.0", "merge intersecting Entities of same type");
retract($first);
retract($second);
mergedEntity.getIntersectingNodes().forEach(node -> update(node));
end
// Rule unit: X.2
rule "X.2.0: Remove Entity of type ENTITY when contained by FALSE_POSITIVE"
salience 64
when
$falsePositive: TextEntity($type: type(), entityType == EntityType.FALSE_POSITIVE, active())
$entity: TextEntity(containedBy($falsePositive), type() == $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), !hasManualChanges(), active())
$entity: TextEntity(containedBy($falsePositive), type() == $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), !hasManualChanges())
then
$entity.getIntersectingNodes().forEach(node -> update(node));
$entity.remove("X.2.0", "remove Entity of type ENTITY when contained by FALSE_POSITIVE");
@@ -924,7 +1255,7 @@ rule "X.3.0: Remove Entity of type RECOMMENDATION when contained by FALSE_RECOMM
salience 64
when
$falseRecommendation: TextEntity($type: type(), entityType == EntityType.FALSE_RECOMMENDATION, active())
$recommendation: TextEntity(containedBy($falseRecommendation), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
$recommendation: TextEntity(containedBy($falseRecommendation), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
then
$recommendation.remove("X.3.0", "remove Entity of type RECOMMENDATION when contained by FALSE_RECOMMENDATION");
retract($recommendation);
@@ -936,7 +1267,7 @@ rule "X.4.0: Remove Entity of type RECOMMENDATION when text range equals ENTITY
salience 256
when
$entity: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
$recommendation: TextEntity(getTextRange().equals($entity.getTextRange()), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
$recommendation: TextEntity(getTextRange().equals($entity.getTextRange()), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
then
$entity.addEngines($recommendation.getEngines());
$recommendation.remove("X.4.0", "remove Entity of type RECOMMENDATION when text range equals ENTITY with same type");
@@ -949,7 +1280,7 @@ rule "X.5.0: Remove Entity of type RECOMMENDATION when intersected by ENTITY"
salience 256
when
$entity: TextEntity((entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
$recommendation: TextEntity(intersects($entity), entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
$recommendation: TextEntity(intersects($entity), entityType == EntityType.RECOMMENDATION, !hasManualChanges())
then
$recommendation.remove("X.5.0", "remove Entity of type RECOMMENDATION when intersected by ENTITY");
retract($recommendation);
@@ -959,7 +1290,7 @@ rule "X.5.1: Remove Entity of type RECOMMENDATION when contained by RECOMMENDATI
salience 256
when
$entity: TextEntity($type: type(), entityType == EntityType.RECOMMENDATION, active())
$recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
$recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
then
$recommendation.remove("X.5.1", "remove Entity of type RECOMMENDATION when contained by RECOMMENDATION");
retract($recommendation);
@@ -967,26 +1298,26 @@ rule "X.5.1: Remove Entity of type RECOMMENDATION when contained by RECOMMENDATI
// Rule unit: X.6
rule "X.6.0: Remove Entity of lower rank, when contained by by entity of type ENTITY"
rule "X.6.0: Remove Entity of lower rank, when contained by entity of type ENTITY or HINT"
salience 32
when
$higherRank: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
$lowerRank: TextEntity(containedBy($higherRank), type() != $type, dictionary.getDictionaryRank(type) < dictionary.getDictionaryRank($type), !hasManualChanges(), active())
$lowerRank: TextEntity(containedBy($higherRank), type() != $type, dictionary.getDictionaryRank(type) < dictionary.getDictionaryRank($type), !hasManualChanges())
then
$lowerRank.getIntersectingNodes().forEach(node -> update(node));
$lowerRank.remove("X.6.0", "remove Entity of lower rank, when contained by entity of type ENTITY");
$lowerRank.remove("X.6.0", "remove Entity of lower rank, when contained by entity of type ENTITY or HINT");
retract($lowerRank);
end
rule "X.6.1: remove Entity of higher rank, when intersected by entity of type ENTITY and length of lower rank Entity is bigger than the higher rank Entity"
rule "X.6.1: remove Entity, when contained in another entity of type ENTITY or HINT with larger text range"
salience 32
when
$higherRank: TextEntity($type: type(), $value: value, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active(), !hasManualChanges())
$outer: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
$lowerRank: TextEntity(intersects($higherRank), type() != $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), dictionary.getDictionaryRank(type) < dictionary.getDictionaryRank($type), active(), $lowerRank.getValue().length() > $value.length())
$inner: TextEntity(containedBy($outer), type() != $type, $outer.getTextRange().length > getTextRange().length(), !hasManualChanges())
then
$higherRank.getIntersectingNodes().forEach(node -> update(node));
$inner.getIntersectingNodes().forEach(node -> update(node));
$higherRank.remove("X.6.1", "remove Entity of higher rank, when intersected by entity of type ENTITY and length of lower rank Entity is bigger than the higher rank Entity");
$inner.remove("X.6.1", "remove Entity, when contained in another entity of type ENTITY or HINT with larger text range");
retract($higherRank);
retract($inner);
end
@@ -1013,6 +1344,18 @@ rule "X.8.1: Remove Entity when intersected by imported Entity"
end
// Rule unit: X.11
rule "X.11.0: Remove dictionary entity which intersects with a manual entity"
salience 64
when
$manualEntity: TextEntity(engines contains Engine.MANUAL, active())
$dictionaryEntity: TextEntity(intersects($manualEntity), dictionaryEntry, engines not contains Engine.MANUAL)
then
$dictionaryEntity.remove("X.11.0", "remove dictionary entity which intersects with a manual entity");
retract($dictionaryEntity);
end
//------------------------------------ File attributes rules ------------------------------------
// Rule unit: FA.1
@@ -1276,7 +1276,7 @@ rule "MAN.3.2: Apply image recategorization"
rule "MAN.3.3: Apply recategorization entities by default"
salience 128
when
$entity: IEntity(getManualOverwrite().getRecategorized().orElse(false), !dictionary.isHint(type()))
$entity: TextEntity(getManualOverwrite().getRecategorized().orElse(false), !dictionary.isHint(type()))
then
$entity.apply("MAN.3.3", "Recategorized entities are applied by default.", $entity.legalBasis());
end
@@ -1312,8 +1312,8 @@ rule "MAN.4.1: Apply legal basis change"
rule "X.0.0: Remove Entity contained by Entity of same type"
salience 65
when
$larger: TextEntity($type: type(), $entityType: entityType, active() || skipped())
$larger: TextEntity($type: type(), $entityType: entityType, !removed())
$contained: TextEntity(containedBy($larger), type() == $type, entityType == $entityType, this != $larger, !hasManualChanges(), active())
$contained: TextEntity(containedBy($larger), type() == $type, entityType == $entityType, this != $larger, !hasManualChanges())
then
$contained.remove("X.0.0", "remove Entity contained by Entity of same type");
retract($contained);
@ -1325,7 +1325,7 @@ rule "X.2.0: Remove Entity of type ENTITY when contained by FALSE_POSITIVE"
salience 64 salience 64
when when
$falsePositive: TextEntity($type: type(), entityType == EntityType.FALSE_POSITIVE, active()) $falsePositive: TextEntity($type: type(), entityType == EntityType.FALSE_POSITIVE, active())
$entity: TextEntity(containedBy($falsePositive), type() == $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), !hasManualChanges(), active()) $entity: TextEntity(containedBy($falsePositive), type() == $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), !hasManualChanges())
then then
$entity.getIntersectingNodes().forEach(node -> update(node)); $entity.getIntersectingNodes().forEach(node -> update(node));
$entity.remove("X.2.0", "remove Entity of type ENTITY when contained by FALSE_POSITIVE"); $entity.remove("X.2.0", "remove Entity of type ENTITY when contained by FALSE_POSITIVE");
@@ -1338,7 +1338,7 @@ rule "X.3.0: Remove Entity of type RECOMMENDATION when contained by FALSE_RECOMM
salience 64
when
$falseRecommendation: TextEntity($type: type(), entityType == EntityType.FALSE_RECOMMENDATION, active())
-$recommendation: TextEntity(containedBy($falseRecommendation), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+$recommendation: TextEntity(containedBy($falseRecommendation), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
then
$recommendation.remove("X.3.0", "remove Entity of type RECOMMENDATION when contained by FALSE_RECOMMENDATION");
retract($recommendation);
@@ -1350,7 +1350,7 @@ rule "X.4.0: Remove Entity of type RECOMMENDATION when text range equals ENTITY
salience 256
when
$entity: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
-$recommendation: TextEntity(getTextRange().equals($entity.getTextRange()), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+$recommendation: TextEntity(getTextRange().equals($entity.getTextRange()), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
then
$entity.addEngines($recommendation.getEngines());
$recommendation.remove("X.4.0", "remove Entity of type RECOMMENDATION when text range equals ENTITY with same type");
@@ -1363,7 +1363,7 @@ rule "X.5.0: Remove Entity of type RECOMMENDATION when intersected by ENTITY"
salience 256
when
$entity: TextEntity((entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
-$recommendation: TextEntity(intersects($entity), entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+$recommendation: TextEntity(intersects($entity), entityType == EntityType.RECOMMENDATION, !hasManualChanges())
then
$recommendation.remove("X.5.0", "remove Entity of type RECOMMENDATION when intersected by ENTITY");
retract($recommendation);
@@ -1373,7 +1373,7 @@ rule "X.5.1: Remove Entity of type RECOMMENDATION when contained by RECOMMENDATI
salience 256
when
$entity: TextEntity($type: type(), entityType == EntityType.RECOMMENDATION, active())
-$recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+$recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
then
$recommendation.remove("X.5.1", "remove Entity of type RECOMMENDATION when contained by RECOMMENDATION");
retract($recommendation);
@@ -1414,6 +1414,18 @@ rule "X.8.1: Remove Entity when intersected by imported Entity"
end
+// Rule unit: X.11
+rule "X.11.0: Remove dictionary entity which intersects with a manual entity"
+salience 64
+when
+$manualEntity: TextEntity(engines contains Engine.MANUAL, active())
+$dictionaryEntity: TextEntity(intersects($manualEntity), dictionaryEntry, engines not contains Engine.MANUAL)
+then
+$dictionaryEntity.remove("X.11.0", "remove dictionary entity which intersects with a manual entity");
+retract($dictionaryEntity);
+end
//------------------------------------ File attributes rules ------------------------------------
// Rule unit: FA.1


@@ -0,0 +1,953 @@
package drools
import static java.lang.String.format;
import static com.iqser.red.service.redaction.v1.server.utils.RedactionSearchUtility.anyMatch;
import static com.iqser.red.service.redaction.v1.server.utils.RedactionSearchUtility.exactMatch;
import java.util.List;
import java.util.LinkedList;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.Collection;
import java.util.stream.Stream;
import java.util.Optional;
import com.iqser.red.service.redaction.v1.server.model.document.*;
import com.iqser.red.service.redaction.v1.server.model.document.TextRange;
import com.iqser.red.service.redaction.v1.server.model.document.entity.*;
import com.iqser.red.service.redaction.v1.server.model.document.entity.EntityType;
import com.iqser.red.service.redaction.v1.server.model.document.entity.MatchedRule;
import com.iqser.red.service.redaction.v1.server.model.document.entity.TextEntity;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.*;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Section;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Table;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.TableCell;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.SemanticNode;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Document;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Paragraph;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Image;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.ImageType;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Page;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Headline;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.SectionIdentifier;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Footer;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Header;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.NodeType;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.*;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlock;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlockCollector;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.AtomicTextBlock;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.ConcatenatedTextBlock;
import com.iqser.red.service.redaction.v1.server.model.NerEntities;
import com.iqser.red.service.redaction.v1.server.model.dictionary.Dictionary;
import com.iqser.red.service.redaction.v1.server.model.dictionary.DictionaryModel;
import com.iqser.red.service.redaction.v1.server.service.document.EntityCreationService;
import com.iqser.red.service.redaction.v1.server.service.ManualChangesApplicationService;
import com.iqser.red.service.redaction.v1.server.utils.RedactionSearchUtility;
import com.iqser.red.service.persistence.service.v1.api.shared.model.FileAttribute;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualResizeRedaction;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.IdRemoval;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualForceRedaction;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRecategorization;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualLegalBasisChange;
global Document document
global EntityCreationService entityCreationService
global ManualChangesApplicationService manualChangesApplicationService
global Dictionary dictionary
//------------------------------------ queries ------------------------------------
query "getFileAttributes"
$fileAttribute: FileAttribute()
end
//------------------------------------ Syngenta specific rules ------------------------------------
// Rule unit: SYN.1
rule "SYN.1.0: Recommend CTL/BL laboratory that start with BL or CTL"
when
$section: Section(containsString("CT") || containsString("BL"))
then
/* Regular expression: ((\b((([Cc]T(([1ILli\/])| L|~P))|(BL))[\. ]?([\dA-Ziltphz~\/.:!]| ?[\(',][Ppi](\(e)?|([\(-?']\/))+( ?[\(\/\dA-Znasieg]+)?)\b( ?\/? ?\d+)?)|(\bCT[L1i]\b)) */
entityCreationService.byRegexIgnoreCase("((\\b((([Cc]T(([1ILli\\/])| L|~P))|(BL))[\\. ]?([\\dA-Ziltphz~\\/.:!]| ?[\\(',][Ppi](\\(e)?|([\\(-?']\\/))+( ?[\\(\\/\\dA-Znasieg]+)?)\\b( ?\\/? ?\\d+)?)|(\\bCT[L1i]\\b))", "CBI_address", EntityType.RECOMMENDATION, $section)
.forEach(entity -> entity.skip("SYN.1.0", ""));
end
//------------------------------------ CBI rules ------------------------------------
// Rule unit: CBI.0
rule "CBI.0.0: Redact CBI Authors (non vertebrate Study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$entity: TextEntity(type() == "CBI_author", dictionaryEntry)
then
$entity.redact("CBI.0.0", "Author found", "Article 39(e)(3) of Regulation (EC) No 178/2002");
end
rule "CBI.0.1: Redact CBI Authors (vertebrate Study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$entity: TextEntity(type() == "CBI_author", dictionaryEntry)
then
$entity.redact("CBI.0.1", "Author found", "Article 39(e)(2) of Regulation (EC) No 178/2002");
end
// Rule unit: CBI.1
rule "CBI.1.0: Do not redact CBI Address (non vertebrate Study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$entity: TextEntity(type() == "CBI_address", dictionaryEntry)
then
$entity.skip("CBI.1.0", "Address found for Non Vertebrate Study");
end
rule "CBI.1.1: Redact CBI Address (vertebrate Study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$entity: TextEntity(type() == "CBI_address", dictionaryEntry)
then
$entity.redact("CBI.1.1", "Address found", "Article 39(e)(2) of Regulation (EC) No 178/2002");
end
// Rule unit: CBI.2
rule "CBI.2.0: Do not redact genitive CBI Author"
when
$entity: TextEntity(type() == "CBI_author", anyMatch(textAfter, "[''ʼˈ´`ʻ']s"))
then
entityCreationService.byTextRange($entity.getTextRange(), "CBI_author", EntityType.FALSE_POSITIVE, document)
.ifPresent(falsePositive -> falsePositive.skip("CBI.2.0", "Genitive Author found"));
end
// Rule unit: CBI.7
rule "CBI.7.0: Do not redact Names and Addresses if published information found in Section without tables"
when
$section: Section(!hasTables(),
hasEntitiesOfType("published_information"),
(hasEntitiesOfType("CBI_author") || hasEntitiesOfType("CBI_address")))
then
$section.getEntitiesOfType(List.of("CBI_author", "CBI_address"))
.forEach(redactionEntity -> {
redactionEntity.skipWithReferences(
"CBI.7.0",
"Published Information found in section",
$section.getEntitiesOfType("published_information")
);
});
end
rule "CBI.7.1: Do not redact Names and Addresses if published information found in same table row"
when
$table: Table(hasEntitiesOfType("published_information"), hasEntitiesOfType("CBI_author") || hasEntitiesOfType("CBI_address"))
$cellsWithPublishedInformation: TableCell() from $table.streamTableCellsWhichContainType("published_information").toList()
$tableCell: TableCell(row == $cellsWithPublishedInformation.row) from $table.streamTableCells().toList()
$authorOrAddress: TextEntity(type() == "CBI_author" || type() == "CBI_address", active()) from $tableCell.getEntities()
then
$authorOrAddress.skipWithReferences("CBI.7.1", "Published Information found in row", $table.getEntitiesOfTypeInSameRow("published_information", $authorOrAddress));
end
// Rule unit: CBI.9
rule "CBI.9.0: Redact all cells with Header Author(s) as CBI_author (non vertebrate study)"
agenda-group "LOCAL_DICTIONARY_ADDS"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$table: Table(hasHeader("Author(s)"))
then
$table.streamTableCellsWithHeader("Author(s)")
.map(tableCell -> entityCreationService.bySemanticNode(tableCell, "CBI_author", EntityType.ENTITY))
.filter(Optional::isPresent)
.map(Optional::get)
.forEach(redactionEntity -> redactionEntity.redact("CBI.9.0", "Author(s) found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
rule "CBI.9.1: Redact all cells with Header Author as CBI_author (non vertebrate study)"
agenda-group "LOCAL_DICTIONARY_ADDS"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$table: Table(hasHeader("Author"))
then
$table.streamTableCellsWithHeader("Author")
.map(tableCell -> entityCreationService.bySemanticNode(tableCell, "CBI_author", EntityType.ENTITY))
.filter(Optional::isPresent)
.map(Optional::get)
.forEach(redactionEntity -> redactionEntity.redact("CBI.9.1", "Author found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
// Rule unit: CBI.10
rule "CBI.10.0: Redact all cells with Header Author(s) as CBI_author (vertebrate study)"
agenda-group "LOCAL_DICTIONARY_ADDS"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$table: Table(hasHeader("Author(s)"))
then
$table.streamTableCellsWithHeader("Author(s)")
.map(tableCell -> entityCreationService.bySemanticNode(tableCell, "CBI_author", EntityType.ENTITY))
.filter(Optional::isPresent)
.map(Optional::get)
.forEach(redactionEntity -> redactionEntity.redact("CBI.10.0", "Author(s) found", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
rule "CBI.10.1: Redact all cells with Header Author as CBI_author (vertebrate study)"
agenda-group "LOCAL_DICTIONARY_ADDS"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$table: Table(hasHeader("Author"))
then
$table.streamTableCellsWithHeader("Author")
.map(tableCell -> entityCreationService.bySemanticNode(tableCell, "CBI_author", EntityType.ENTITY))
.filter(Optional::isPresent)
.map(Optional::get)
.forEach(redactionEntity -> redactionEntity.redact("CBI.10.1", "Author found", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
// Rule unit: CBI.11
rule "CBI.11.0: Recommend all CBI_author entities in Table with Vertebrate Study Y/N Header"
agenda-group "LOCAL_DICTIONARY_ADDS"
salience -1
when
$table: Table(hasHeader("Author(s)") && hasHeader("Vertebrate Study Y/N"))
then
$table.getEntitiesOfType("CBI_author").stream().filter(IEntity::applied).forEach(entity -> dictionary.addMultipleAuthorsAsRecommendation(entity));
end
// Rule unit: CBI.16
rule "CBI.16.1: Add CBI_author with \"et al.\" RegEx (non vertebrate study)"
agenda-group "LOCAL_DICTIONARY_ADDS"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(containsString("et al."))
then
entityCreationService.byRegex("\\b([A-ZÄÖÜ][^\\s\\.,]+( [A-ZÄÖÜ]{1,2}\\.?)?( ?[A-ZÄÖÜ]\\.?)?) et al\\.?", "CBI_author", EntityType.ENTITY, 1, $section)
.forEach(entity -> {
entity.redact("CBI.16.1", "Author found by \"et al\" regex", "Article 39(e)(3) of Regulation (EC) No 178/2002");
dictionary.recommendEverywhere(entity);
});
end
rule "CBI.16.2: Add CBI_author with \"et al.\" RegEx (vertebrate study)"
agenda-group "LOCAL_DICTIONARY_ADDS"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(containsString("et al."))
then
entityCreationService.byRegex("\\b([A-ZÄÖÜ][^\\s\\.,]+( [A-ZÄÖÜ]{1,2}\\.?)?( ?[A-ZÄÖÜ]\\.?)?) et al\\.?", "CBI_author", EntityType.ENTITY, 1, $section)
.forEach(entity -> {
entity.redact("CBI.16.2", "Author found by \"et al\" regex", "Article 39(e)(2) of Regulation (EC) No 178/2002");
dictionary.recommendEverywhere(entity);
});
end
// Rule unit: CBI.17
rule "CBI.17.0: Add recommendation for Addresses in Test Organism sections, without colon"
when
$section: Section(!hasTables(), containsString("Species") && containsString("Source") && !containsString("Species:") && !containsString("Source:"))
then
entityCreationService.lineAfterString("Source", "CBI_address", EntityType.RECOMMENDATION, $section)
.forEach(entity -> entity.skip("CBI.17.0", "Line after \"Source\" in Test Organism Section"));
end
rule "CBI.17.1: Add recommendation for Addresses in Test Organism sections, with colon"
when
$section: Section(!hasTables(), containsString("Species:"), containsString("Source:"))
then
entityCreationService.lineAfterString("Source:", "CBI_address", EntityType.RECOMMENDATION, $section)
.forEach(entity -> entity.skip("CBI.17.1", "Line after \"Source:\" in Test Animals Section"));
end
// Rule unit: CBI.20
rule "CBI.20.1: Redact between \"PERFORMING LABORATORY\" and \"LABORATORY PROJECT ID:\" (non vertebrate study)"
agenda-group "LOCAL_DICTIONARY_ADDS"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(!hasTables(), containsString("PERFORMING LABORATORY:"), containsString("LABORATORY PROJECT ID:"))
then
entityCreationService.betweenStrings("PERFORMING LABORATORY:", "LABORATORY PROJECT ID:", "CBI_address", EntityType.ENTITY, $section)
.forEach(laboratoryEntity -> {
laboratoryEntity.skip("CBI.20.1", "PERFORMING LABORATORY was found for non vertebrate study");
dictionary.recommendEverywhere(laboratoryEntity);
});
end
rule "CBI.20.2: Redact between \"PERFORMING LABORATORY\" and \"LABORATORY PROJECT ID:\" (vertebrate study)"
agenda-group "LOCAL_DICTIONARY_ADDS"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(!hasTables(), containsString("PERFORMING LABORATORY:"), containsString("LABORATORY PROJECT ID:"))
then
entityCreationService.betweenStrings("PERFORMING LABORATORY:", "LABORATORY PROJECT ID:", "CBI_address", EntityType.ENTITY, $section)
.forEach(laboratoryEntity -> {
laboratoryEntity.redact("CBI.20.2", "PERFORMING LABORATORY was found", "Article 39(e)(2) of Regulation (EC) No 178/2002");
dictionary.recommendEverywhere(laboratoryEntity);
});
end
// Rule unit: CBI.23
rule "CBI.23.0: Redact between \"AUTHOR(S)\" and \"(STUDY) COMPLETION DATE\" (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$document: Document(containsStringIgnoreCase("AUTHOR(S)"), containsAnyStringIgnoreCase("COMPLETION DATE", "STUDY COMPLETION DATE"))
then
entityCreationService.shortestBetweenAnyStringIgnoreCase(List.of("AUTHOR(S)", "AUTHOR(S):"), List.of("COMPLETION DATE", "COMPLETION DATE:", "STUDY COMPLETION DATE", "STUDY COMPLETION DATE:"), "CBI_author", EntityType.ENTITY, $document)
.forEach(authorEntity -> authorEntity.redact("CBI.23.0", "AUTHOR(S) was found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
rule "CBI.23.1: Redact between \"AUTHOR(S)\" and \"(STUDY) COMPLETION DATE\" (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$document: Document(containsStringIgnoreCase("AUTHOR(S)"), containsAnyStringIgnoreCase("COMPLETION DATE", "STUDY COMPLETION DATE"))
then
entityCreationService.shortestBetweenAnyStringIgnoreCase(List.of("AUTHOR(S)", "AUTHOR(S):"), List.of("COMPLETION DATE", "COMPLETION DATE:", "STUDY COMPLETION DATE", "STUDY COMPLETION DATE:"), "CBI_author", EntityType.ENTITY, $document)
.forEach(authorEntity -> authorEntity.redact("CBI.23.1", "AUTHOR(S) was found", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
//------------------------------------ PII rules ------------------------------------
// Rule unit: PII.0
rule "PII.0.1: Redact all PII (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$pii: TextEntity(type() == "PII", dictionaryEntry)
then
$pii.redact("PII.0.1", "Personal Information found", "Article 39(e)(3) of Regulation (EC) No 178/2002");
end
rule "PII.0.2: Redact all PII (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$pii: TextEntity(type() == "PII", dictionaryEntry)
then
$pii.redact("PII.0.2", "Personal Information found", "Article 39(e)(2) of Regulation (EC) No 178/2002");
end
// Rule unit: PII.1
rule "PII.1.1: Redact Emails by RegEx (Non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(containsString("@"))
then
entityCreationService.byRegex("\\b([A-Za-z0-9._%+\\-]+@[A-Za-z0-9.\\-]+\\.[A-Za-z\\-]{1,23}[A-Za-z])\\b", "PII", EntityType.ENTITY, 1, $section)
.forEach(emailEntity -> emailEntity.redact("PII.1.1", "Found by Email Regex", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
rule "PII.1.2: Redact Emails by RegEx (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(containsString("@"))
then
entityCreationService.byRegex("\\b([A-Za-z0-9._%+\\-]+@[A-Za-z0-9.\\-]+\\.[A-Za-z\\-]{1,23}[A-Za-z])\\b", "PII", EntityType.ENTITY, 1, $section)
.forEach(emailEntity -> emailEntity.redact("PII.1.2", "Found by Email Regex", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
rule "PII.1.3: Redact typoed Emails with indicator"
when
$section: Section(containsString("@") || containsStringIgnoreCase("mail"))
then
entityCreationService.byRegexIgnoreCase("mail[:\\.\\s]{1,2}([\\w\\/\\-\\{\\(\\. ]{3,20}(@|a|f)\\s?[\\w\\/\\-\\{\\(\\. ]{3,20}(\\. \\w{2,4}\\b|\\.\\B|\\.\\w{1,4}\\b))", "PII", EntityType.ENTITY, 1, $section)
.forEach(emailEntity -> emailEntity.redact("PII.1.3", "Personal information found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
// Rule unit: PII.2
rule "PII.2.1: Redact Phone and Fax by RegEx (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(containsString("Contact") ||
containsString("Telephone") ||
containsString("Phone") ||
containsString("Ph.") ||
containsString("Fax") ||
containsString("Tel") ||
containsString("Ter") ||
containsString("Mobile") ||
containsString("Fel") ||
containsString("Fer"))
then
entityCreationService.byRegexIgnoreCase("\\b(contact|telephone|phone|ph\\.|fax|tel|ter[^m]|mobile|fel|fer)[a-zA-Z\\s]{0,10}[:.\\s]{0,3}([\\+\\d\\(][\\s\\d\\(\\)\\-\\/\\.]{4,100}\\d)\\b", "PII", EntityType.ENTITY, 2, $section)
.forEach(contactEntity -> contactEntity.redact("PII.2.1", "Found by Phone and Fax Regex", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
rule "PII.2.2: Redact Phone and Fax by RegEx (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(containsString("Contact") ||
containsString("Telephone") ||
containsString("Phone") ||
containsString("Ph.") ||
containsString("Fax") ||
containsString("Tel") ||
containsString("Ter") ||
containsString("Mobile") ||
containsString("Fel") ||
containsString("Fer"))
then
entityCreationService.byRegexIgnoreCase("\\b(contact|telephone|phone|ph\\.|fax|tel|ter[^m]|mobile|fel|fer)[a-zA-Z\\s]{0,10}[:.\\s]{0,3}([\\+\\d\\(][\\s\\d\\(\\)\\-\\/\\.]{4,100}\\d)\\b", "PII", EntityType.ENTITY, 2, $section)
.forEach(contactEntity -> contactEntity.redact("PII.2.2", "Found by Phone and Fax Regex", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
rule "PII.2.3: Redact phone numbers without indicators"
when
$section: Section(containsString("+"))
then
entityCreationService.byRegex("(\\+[\\dO]{1,2} )(\\([\\dO]{1,3}\\))?[\\d\\-O ]{8,15}", "PII", EntityType.ENTITY, $section)
.forEach(entity -> entity.redact("PII.2.3", "Personal information found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
// Rule unit: PII.3
rule "PII.3.1: Redact telephone numbers by RegEx (Non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(matchesRegex("[+]\\d{1,}"))
then
entityCreationService.byRegex("((([+]\\d{1,3} (\\d{7,12})\\b)|([+]\\d{1,3}(\\d{3,12})\\b|[+]\\d{1,3}([ -]\\(?\\d{1,6}\\)?){2,4})|[+]\\d{1,3} ?((\\d{2,6}\\)?)([ -]\\d{2,6}){1,4}))(-\\d{1,3})?\\b)", "PII", EntityType.ENTITY, 1, $section)
.forEach(entity -> entity.redact("PII.3.1", "Telephone number found by regex", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
rule "PII.3.2: Redact telephone numbers by RegEx (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(matchesRegex("[+]\\d{1,}"))
then
entityCreationService.byRegex("((([+]\\d{1,3} (\\d{7,12})\\b)|([+]\\d{1,3}(\\d{3,12})\\b|[+]\\d{1,3}([ -]\\(?\\d{1,6}\\)?){2,4})|[+]\\d{1,3} ?((\\d{2,6}\\)?)([ -]\\d{2,6}){1,4}))(-\\d{1,3})?\\b)", "PII", EntityType.ENTITY, 1, $section)
.forEach(entity -> entity.redact("PII.3.2", "Telephone number found by regex", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
// Rule unit: PII.7
rule "PII.7.1: Redact contact information if applicant is found (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(getHeadline().containsString("applicant") ||
getHeadline().containsString("Primary contact") ||
getHeadline().containsString("Alternative contact") ||
containsString("Applicant") ||
containsString("Telephone number:"))
then
Stream.concat(entityCreationService.lineAfterStrings(List.of("Contact point:", "Contact:", "Alternative contact:", "European contact:", "No:", "Contact:", "Tel.:", "Tel:", "Telephone number:",
"Telephone No:", "Telephone:", "Phone No.", "Phone:", "Fax number:", "Fax:", "E-mail:", "Email:", "e-mail:", "E-mail address:"), "PII", EntityType.ENTITY, $section),
Stream.concat(
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
))
.forEach(entity -> entity.redact("PII.7.1", "Applicant information was found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
rule "PII.7.2: Redact contact information if applicant is found (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(getHeadline().containsString("applicant") ||
getHeadline().containsString("Primary contact") ||
getHeadline().containsString("Alternative contact") ||
containsString("Applicant") ||
containsString("Telephone number:"))
then
Stream.concat(entityCreationService.lineAfterStrings(List.of("Contact point:", "Contact:", "Alternative contact:", "European contact:", "No:", "Contact:", "Tel.:", "Tel:", "Telephone number:",
"Telephone No:", "Telephone:", "Phone No.", "Phone:", "Fax number:", "Fax:", "E-mail:", "Email:", "e-mail:", "E-mail address:"), "PII", EntityType.ENTITY, $section),
Stream.concat(
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
))
.forEach(entity -> entity.redact("PII.7.2", "Applicant information was found", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
// Rule unit: PII.8
rule "PII.8.1: Redact contact information if producer is found (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(containsStringIgnoreCase("producer of the plant protection") ||
containsStringIgnoreCase("producer of the active substance") ||
containsStringIgnoreCase("manufacturer of the active substance") ||
containsStringIgnoreCase("manufacturer:") ||
containsStringIgnoreCase("Producer or producers of the active substance"))
then
Stream.concat(entityCreationService.lineAfterStrings(List.of("Contact point:", "Contact:", "Alternative contact:", "European contact:", "No:", "Contact:", "Tel.:", "Tel:", "Telephone number:",
"Telephone No:", "Telephone:", "Phone No.", "Phone:", "Fax number:", "Fax:", "E-mail:", "Email:", "e-mail:", "E-mail address:"), "PII", EntityType.ENTITY, $section),
Stream.concat(
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
))
.forEach(entity -> entity.redact("PII.8.1", "Producer was found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
rule "PII.8.2: Redact contact information if producer is found (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(containsStringIgnoreCase("producer of the plant protection") ||
containsStringIgnoreCase("producer of the active substance") ||
containsStringIgnoreCase("manufacturer of the active substance") ||
containsStringIgnoreCase("manufacturer:") ||
containsStringIgnoreCase("Producer or producers of the active substance"))
then
Stream.concat(entityCreationService.lineAfterStrings(List.of("Contact point:", "Contact:", "Alternative contact:", "European contact:", "No:", "Contact:", "Tel.:", "Tel:", "Telephone number:",
"Telephone No:", "Telephone:", "Phone No.", "Phone:", "Fax number:", "Fax:", "E-mail:", "Email:", "e-mail:", "E-mail address:"), "PII", EntityType.ENTITY, $section),
Stream.concat(
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
))
.forEach(entity -> entity.redact("PII.8.2", "Producer was found", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
// Rule unit: PII.10
rule "PII.10.0: Redact study director abbreviation"
when
$section: Section(containsString("KATH") || containsString("BECH") || containsString("KML"))
then
entityCreationService.byRegexIgnoreCase("((KATH)|(BECH)|(KML)) ?(\\d{4})","PII", EntityType.ENTITY, 1, $section)
.forEach(entity -> entity.redact("PII.10.0", "Personal information found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
// Rule unit: PII.11
rule "PII.11.0: Redact On behalf of Sequani Ltd.:"
when
$section: Section(!hasTables(), containsString("On behalf of Sequani Ltd.: Name Title"))
then
entityCreationService.betweenStrings("On behalf of Sequani Ltd.: Name Title", "On behalf of", "PII", EntityType.ENTITY, $section)
.forEach(authorEntity -> authorEntity.redact("PII.11.0", "On behalf of Sequani Ltd.: Name Title was found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
// Rule unit: PII.12
rule "PII.12.0: Expand PII entities with salutation prefix"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$entityToExpand: TextEntity(type() == "PII", anyMatch(textBefore, "\\b(Mrs?|Ms|Miss|Sir|Madame?|Mme)\\s?\\.?\\s*"))
then
entityCreationService.byPrefixExpansionRegex($entityToExpand, "\\b(Mrs?|Ms|Miss|Sir|Madame?|Mme)\\s?\\.?\\s*")
.ifPresent(expandedEntity -> expandedEntity.apply("PII.12.0", "Expanded PII with salutation prefix", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
rule "PII.12.1: Expand PII entities with salutation prefix"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$entityToExpand: TextEntity(type() == "PII", anyMatch(textBefore, "\\b(Mrs?|Ms|Miss|Sir|Madame?|Mme)\\s?\\.?\\s*"))
then
entityCreationService.byPrefixExpansionRegex($entityToExpand, "\\b(Mrs?|Ms|Miss|Sir|Madame?|Mme)\\s?\\.?\\s*")
.ifPresent(expandedEntity -> expandedEntity.apply("PII.12.1", "Expanded PII with salutation prefix", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
//------------------------------------ Other rules ------------------------------------
// Rule unit: ETC.0
rule "ETC.0.0: Purity Hint"
when
$section: Section(containsStringIgnoreCase("purity"))
then
entityCreationService.byRegexIgnoreCase("(purity ?( of|\\(.{1,20}\\))?( ?:)?) .{0,5}[\\d\\.]+( .{0,4}\\.)? ?%", "hint_only", EntityType.HINT, 1, $section)
.forEach(hint -> hint.skip("ETC.0.0", "hint only"));
end
// Rule unit: ETC.2
rule "ETC.2.1: Redact signatures (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$signature: Image(imageType == ImageType.SIGNATURE)
then
$signature.redact("ETC.2.1", "Signature Found", "Article 39(e)(3) of Regulation (EC) No 178/2002");
end
rule "ETC.2.2: Redact signatures (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$signature: Image(imageType == ImageType.SIGNATURE)
then
$signature.redact("ETC.2.2", "Signature Found", "Article 39(e)(2) of Regulation (EC) No 178/2002");
end
// Rule unit: ETC.3
rule "ETC.3.1: Skip logos (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$logo: Image(imageType == ImageType.LOGO)
then
$logo.skip("ETC.3.1", "Logo Found");
end
rule "ETC.3.2: Redact logos (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$logo: Image(imageType == ImageType.LOGO)
then
$logo.redact("ETC.3.2", "Logo Found", "Article 39(e)(3) of Regulation (EC) No 178/2002");
end
// Rule unit: ETC.5
rule "ETC.5.0: Skip dossier_redaction entries if confidentiality is 'confidential'"
when
FileAttribute(label == "Confidentiality", value == "confidential")
$dossierRedaction: TextEntity(type() == "dossier_redaction")
then
$dossierRedaction.skip("ETC.5.0", "Ignore dossier_redaction when confidential");
$dossierRedaction.getIntersectingNodes().forEach(node -> update(node));
end
rule "ETC.5.1: Remove dossier_redaction entries if confidentiality is not 'confidential'"
salience 256
when
not FileAttribute(label == "Confidentiality", value == "confidential")
$dossierRedaction: TextEntity(type() == "dossier_redaction")
then
$dossierRedaction.remove("ETC.5.1", "Remove dossier_redaction when not confidential");
retract($dossierRedaction);
end
//------------------------------------ AI rules ------------------------------------
// Rule unit: AI.0
rule "AI.0.0: Add all NER Entities of type CBI_author"
salience 999
when
nerEntities: NerEntities(hasEntitiesOfType("CBI_author"))
then
nerEntities.streamEntitiesOfType("CBI_author")
.forEach(nerEntity -> entityCreationService.optionalByNerEntity(nerEntity, EntityType.RECOMMENDATION, document));
end
// Rule unit: AI.1
rule "AI.1.0: Combine and add NER Entities as CBI_address"
salience 999
when
nerEntities: NerEntities(hasEntitiesOfType("ORG") || hasEntitiesOfType("STREET") || hasEntitiesOfType("CITY"))
then
entityCreationService.combineNerEntitiesToCbiAddressDefaults(nerEntities, "CBI_address", EntityType.RECOMMENDATION, document).toList();
end
//------------------------------------ Manual changes rules ------------------------------------
// Rule unit: MAN.0
rule "MAN.0.0: Apply manual resize redaction"
salience 128
when
$resizeRedaction: ManualResizeRedaction($id: annotationId, $requestDate: requestDate)
not ManualResizeRedaction(annotationId == $id, requestDate.isBefore($requestDate))
$entityToBeResized: TextEntity(matchesAnnotationId($id))
then
manualChangesApplicationService.resize($entityToBeResized, $resizeRedaction);
retract($resizeRedaction);
update($entityToBeResized);
$entityToBeResized.getIntersectingNodes().forEach(node -> update(node));
end
rule "MAN.0.1: Apply manual resize redaction"
salience 128
when
$resizeRedaction: ManualResizeRedaction($id: annotationId, $requestDate: requestDate)
not ManualResizeRedaction(annotationId == $id, requestDate.isBefore($requestDate))
$imageToBeResized: Image(id == $id)
then
manualChangesApplicationService.resizeImage($imageToBeResized, $resizeRedaction);
retract($resizeRedaction);
update($imageToBeResized);
update($imageToBeResized.getParent());
end
// Rule unit: MAN.1
rule "MAN.1.0: Apply id removals that are valid and not in forced redactions to Entity"
salience 128
when
$idRemoval: IdRemoval($id: annotationId, !removeFromDictionary, !removeFromAllDossiers)
$entityToBeRemoved: TextEntity(matchesAnnotationId($id))
then
$entityToBeRemoved.getManualOverwrite().addChange($idRemoval);
update($entityToBeRemoved);
retract($idRemoval);
$entityToBeRemoved.getIntersectingNodes().forEach(node -> update(node));
end
rule "MAN.1.1: Apply id removals that are valid and not in forced redactions to Image"
salience 128
when
$idRemoval: IdRemoval($id: annotationId)
$imageEntityToBeRemoved: Image($id == id)
then
$imageEntityToBeRemoved.getManualOverwrite().addChange($idRemoval);
update($imageEntityToBeRemoved);
retract($idRemoval);
update($imageEntityToBeRemoved.getParent());
end
// Rule unit: MAN.2
rule "MAN.2.0: Apply force redaction"
salience 128
when
$force: ManualForceRedaction($id: annotationId)
$entityToForce: TextEntity(matchesAnnotationId($id))
then
$entityToForce.getManualOverwrite().addChange($force);
update($entityToForce);
$entityToForce.getIntersectingNodes().forEach(node -> update(node));
retract($force);
end
rule "MAN.2.1: Apply force redaction to images"
salience 128
when
$force: ManualForceRedaction($id: annotationId)
$imageToForce: Image(id == $id)
then
$imageToForce.getManualOverwrite().addChange($force);
update($imageToForce);
update($imageToForce.getParent());
retract($force);
end
// Rule unit: MAN.3
rule "MAN.3.0: Apply entity recategorization"
salience 128
when
$recategorization: ManualRecategorization($id: annotationId, $type: type, $requestDate: requestDate)
not ManualRecategorization($id == annotationId, requestDate.isBefore($requestDate))
$entityToBeRecategorized: TextEntity(matchesAnnotationId($id), type() != $type)
then
$entityToBeRecategorized.getIntersectingNodes().forEach(node -> update(node));
$entityToBeRecategorized.getManualOverwrite().addChange($recategorization);
update($entityToBeRecategorized);
retract($recategorization);
end
rule "MAN.3.1: Apply entity recategorization of same type"
salience 128
when
$recategorization: ManualRecategorization($id: annotationId, $type: type, $requestDate: requestDate)
not ManualRecategorization($id == annotationId, requestDate.isBefore($requestDate))
$entityToBeRecategorized: TextEntity(matchesAnnotationId($id), type() == $type)
then
$entityToBeRecategorized.getManualOverwrite().addChange($recategorization);
retract($recategorization);
end
rule "MAN.3.2: Apply image recategorization"
salience 128
when
$recategorization: ManualRecategorization($id: annotationId, $requestDate: requestDate)
not ManualRecategorization($id == annotationId, requestDate.isBefore($requestDate))
$imageToBeRecategorized: Image($id == id)
then
manualChangesApplicationService.recategorize($imageToBeRecategorized, $recategorization);
update($imageToBeRecategorized);
update($imageToBeRecategorized.getParent());
retract($recategorization);
end
rule "MAN.3.3: Apply recategorization entities by default"
salience 128
when
$entity: TextEntity(getManualOverwrite().getRecategorized().orElse(false), !dictionary.isHint(type()))
then
$entity.apply("MAN.3.3", "Recategorized entities are applied by default.", $entity.legalBasis());
end
// Rule unit: MAN.4
rule "MAN.4.0: Apply legal basis change"
salience 128
when
$legalBasisChange: ManualLegalBasisChange($id: annotationId)
$imageToBeRecategorized: Image($id == id)
then
$imageToBeRecategorized.getManualOverwrite().addChange($legalBasisChange);
update($imageToBeRecategorized);
retract($legalBasisChange);
end
rule "MAN.4.1: Apply legal basis change"
salience 128
when
$legalBasisChange: ManualLegalBasisChange($id: annotationId)
$entityToBeChanged: TextEntity(matchesAnnotationId($id))
then
$entityToBeChanged.getManualOverwrite().addChange($legalBasisChange);
update($entityToBeChanged);
retract($legalBasisChange);
end
//------------------------------------ Entity merging rules ------------------------------------
// Rule unit: X.0
rule "X.0.0: Remove Entity contained by Entity of same type"
salience 65
when
$larger: TextEntity($type: type(), $entityType: entityType, !removed())
$contained: TextEntity(containedBy($larger), type() == $type, entityType == $entityType, this != $larger, !hasManualChanges())
then
$contained.remove("X.0.0", "remove Entity contained by Entity of same type");
retract($contained);
end
// Rule unit: X.2
rule "X.2.0: Remove Entity of type ENTITY when contained by FALSE_POSITIVE"
salience 64
when
$falsePositive: TextEntity($type: type(), entityType == EntityType.FALSE_POSITIVE, active())
$entity: TextEntity(containedBy($falsePositive), type() == $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), !hasManualChanges())
then
$entity.getIntersectingNodes().forEach(node -> update(node));
$entity.remove("X.2.0", "remove Entity of type ENTITY when contained by FALSE_POSITIVE");
retract($entity);
end
// Rule unit: X.3
rule "X.3.0: Remove Entity of type RECOMMENDATION when contained by FALSE_RECOMMENDATION"
salience 64
when
$falseRecommendation: TextEntity($type: type(), entityType == EntityType.FALSE_RECOMMENDATION, active())
$recommendation: TextEntity(containedBy($falseRecommendation), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
then
$recommendation.remove("X.3.0", "remove Entity of type RECOMMENDATION when contained by FALSE_RECOMMENDATION");
retract($recommendation);
end
// Rule unit: X.4
rule "X.4.0: Remove Entity of type RECOMMENDATION when text range equals ENTITY with same type"
salience 256
when
$entity: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
$recommendation: TextEntity(getTextRange().equals($entity.getTextRange()), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
then
$entity.addEngines($recommendation.getEngines());
$recommendation.remove("X.4.0", "remove Entity of type RECOMMENDATION when text range equals ENTITY with same type");
retract($recommendation);
end
// Rule unit: X.5
rule "X.5.0: Remove Entity of type RECOMMENDATION when intersected by ENTITY"
salience 256
when
$entity: TextEntity((entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
$recommendation: TextEntity(intersects($entity), entityType == EntityType.RECOMMENDATION, !hasManualChanges())
then
$recommendation.remove("X.5.0", "remove Entity of type RECOMMENDATION when intersected by ENTITY");
retract($recommendation);
end
rule "X.5.1: Remove Entity of type RECOMMENDATION when contained by RECOMMENDATION"
salience 256
when
$entity: TextEntity($type: type(), entityType == EntityType.RECOMMENDATION, active())
$recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
then
$recommendation.remove("X.5.1", "remove Entity of type RECOMMENDATION when contained by RECOMMENDATION");
retract($recommendation);
end
// Rule unit: X.6
rule "X.6.0: Remove Entity of lower rank, when contained by entity of type ENTITY or HINT"
salience 32
when
$higherRank: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
$lowerRank: TextEntity(containedBy($higherRank), type() != $type, dictionary.getDictionaryRank(type()) < dictionary.getDictionaryRank($type), !hasManualChanges())
then
$lowerRank.getIntersectingNodes().forEach(node -> update(node));
$lowerRank.remove("X.6.0", "remove Entity of lower rank, when contained by entity of type ENTITY or HINT");
retract($lowerRank);
end
rule "X.6.1: remove Entity, when contained in another entity of type ENTITY or HINT with larger text range"
salience 32
when
$outer: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
$inner: TextEntity(containedBy($outer), type() != $type, $outer.getTextRange().length() > getTextRange().length(), !hasManualChanges())
then
$inner.getIntersectingNodes().forEach(node -> update(node));
$inner.remove("X.6.1", "remove Entity, when contained in another entity of type ENTITY or HINT with larger text range");
retract($inner);
end
// Rule unit: X.8
rule "X.8.0: Remove Entity when text range and type equals to imported Entity"
salience 257
when
$entity: TextEntity($type: type(), engines contains Engine.IMPORTED, active())
$other: TextEntity(getTextRange().equals($entity.getTextRange()), this != $entity, type() == $type, engines not contains Engine.IMPORTED)
then
$other.remove("X.8.0", "remove Entity when text range and type equals to imported Entity");
$entity.addEngines($other.getEngines());
retract($other);
end
rule "X.8.1: Remove Entity when intersected by imported Entity"
salience 256
when
$entity: TextEntity(engines contains Engine.IMPORTED, active())
$other: TextEntity(intersects($entity), this != $entity, engines not contains Engine.IMPORTED)
then
$other.remove("X.8.1", "remove Entity when intersected by imported Entity");
retract($other);
end
// Rule unit: X.11
rule "X.11.0: Remove dictionary entity which intersects with a manual entity"
salience 64
when
$manualEntity: TextEntity(engines contains Engine.MANUAL, active())
$dictionaryEntity: TextEntity(intersects($manualEntity), dictionaryEntry, engines not contains Engine.MANUAL)
then
$dictionaryEntity.remove("X.11.0", "remove dictionary entity which intersects with a manual entity");
retract($dictionaryEntity);
end
//------------------------------------ File attributes rules ------------------------------------
// Rule unit: FA.1
rule "FA.1.0: Remove duplicate FileAttributes"
salience 64
when
$fileAttribute: FileAttribute($label: label, $value: value)
$duplicate: FileAttribute(this != $fileAttribute, label == $label, value == $value)
then
retract($duplicate);
end
//------------------------------------ Local dictionary search rules ------------------------------------
// Rule unit: LDS.0
rule "LDS.0.0: Run local dictionary search"
agenda-group "LOCAL_DICTIONARY_ADDS"
salience -999
when
$dictionaryModel: DictionaryModel(!localEntriesWithMatchedRules.isEmpty()) from dictionary.getDictionaryModels()
then
entityCreationService.bySearchImplementation($dictionaryModel.getLocalSearch(), $dictionaryModel.getType(), EntityType.RECOMMENDATION, document)
.forEach(entity -> {
Collection<MatchedRule> matchedRules = $dictionaryModel.getMatchedRulesForLocalDictionaryEntry(entity.getValue());
matchedRules.forEach(matchedRule -> entity.addMatchedRule(matchedRule.asSkippedIfApplied()));
});
end
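The salutation-prefix pattern passed to byPrefixExpansionRegex in PII.12.0/PII.12.1 is a plain Java regular expression, so it can be exercised outside the rule engine. The sketch below is a standalone illustration only: entityCreationService and TextEntity are not available here, so expandedStart is a hypothetical helper that mimics "expand the entity start to cover a salutation immediately before it".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SalutationPrefixDemo {
    // Same pattern string the PII.12.x rules pass to byPrefixExpansionRegex
    static final Pattern SALUTATION =
            Pattern.compile("\\b(Mrs?|Ms|Miss|Sir|Madame?|Mme)\\s?\\.?\\s*");

    // Hypothetical helper: given the text directly before an entity, return
    // the expanded start offset if that text ends with a salutation, else -1.
    static int expandedStart(String textBefore) {
        Matcher m = SALUTATION.matcher(textBefore);
        int start = -1;
        while (m.find()) {
            // Only a salutation that runs right up to the entity boundary counts
            if (m.end() == textBefore.length()) {
                start = m.start();
            }
        }
        return start;
    }

    public static void main(String[] args) {
        System.out.println(expandedStart("Report written by Mr. ")); // 18
        System.out.println(expandedStart("Report written by "));     // -1
    }
}
```

Note that `\s?\.?\s*` makes both the dot and the trailing whitespace optional, so "Mr ", "Mr. " and "Mme" directly adjacent to the name all match.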

@@ -188,7 +188,7 @@ rule "MAN.3.2: Apply image recategorization"
 rule "MAN.3.3: Apply recategorization entities by default"
 salience 128
 when
-$entity: IEntity(getManualOverwrite().getRecategorized().orElse(false), !dictionary.isHint(type()))
+$entity: TextEntity(getManualOverwrite().getRecategorized().orElse(false), !dictionary.isHint(type()))
 then
 $entity.apply("MAN.3.3", "Recategorized entities are applied by default.", $entity.legalBasis());
 end
@@ -225,7 +225,7 @@ rule "X.5.1: Remove Entity of type RECOMMENDATION when contained by RECOMMENDATI
 salience 256
 when
 $entity: TextEntity($type: type(), entityType == EntityType.RECOMMENDATION, active())
-$recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+$recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
 $recommendation.remove("X.5.1", "remove Entity of type RECOMMENDATION when contained by RECOMMENDATION");
 retract($recommendation);
@@ -255,6 +255,18 @@ rule "X.8.1: Remove Entity when intersected by imported Entity"
 end
+// Rule unit: X.11
+rule "X.11.0: Remove dictionary entity which intersects with a manual entity"
+salience 64
+when
+$manualEntity: TextEntity(engines contains Engine.MANUAL, active())
+$dictionaryEntity: TextEntity(intersects($manualEntity), dictionaryEntry, engines not contains Engine.MANUAL)
+then
+$dictionaryEntity.remove("X.11.0", "remove dictionary entity which intersects with a manual entity");
+retract($dictionaryEntity);
+end
 //------------------------------------ Local dictionary search rules ------------------------------------
 // Rule unit: LDS.0

@ -150,7 +150,7 @@ rule "CBI.4.0: Do not redact Names and Addresses if no_redaction_indicator is fo
}); });
end end
rule "CBI.4.1: Do not redact Names and Addresses if no_redaction_indicator is found in table row" rule "CBI.4.1: Don't redact authors or addresses which appear in the same row as a vertebrate and a no_redaction_indicator"
when when
$table: Table(hasEntitiesOfType("no_redaction_indicator"), $table: Table(hasEntitiesOfType("no_redaction_indicator"),
hasEntitiesOfType("vertebrate"), hasEntitiesOfType("vertebrate"),
@ -185,7 +185,7 @@ rule "CBI.5.0: Redact Names and Addresses if no_redaction_indicator but also red
"no_redaction_indicator but also redaction_indicator found", "no_redaction_indicator but also redaction_indicator found",
"Reg (EC) No 1107/2009 Art. 63 (2g)", "Reg (EC) No 1107/2009 Art. 63 (2g)",
Stream.concat( Stream.concat(
$section.getEntitiesOfType("vertebrate").stream(), $section.getEntitiesOfType("redaction_indicator").stream(),
$section.getEntitiesOfType("no_redaction_indicator").stream()).toList() $section.getEntitiesOfType("no_redaction_indicator").stream()).toList()
); );
}); });
@ -205,7 +205,7 @@ rule "CBI.5.1: Redact Names and Addresses if no_redaction_indicator but also red
"no_redaction_indicator but also redaction_indicator found", "no_redaction_indicator but also redaction_indicator found",
"Reg (EC) No 1107/2009 Art. 63 (2g)", "Reg (EC) No 1107/2009 Art. 63 (2g)",
Stream.concat( Stream.concat(
$table.getEntitiesOfTypeInSameRow("vertebrate", entity).stream(), $table.getEntitiesOfTypeInSameRow("redaction_indicator", entity).stream(),
$table.getEntitiesOfTypeInSameRow("no_redaction_indicator", entity).stream()).toList() $table.getEntitiesOfTypeInSameRow("no_redaction_indicator", entity).stream()).toList()
); );
}); });
@ -236,11 +236,11 @@ rule "CBI.8.1: Redacted because table row contains must_redact entity"
.filter(entity -> entity.getType().equals("CBI_author") || entity.getType().equals("CBI_address")) .filter(entity -> entity.getType().equals("CBI_author") || entity.getType().equals("CBI_address"))
.forEach(entity -> { .forEach(entity -> {
entity.applyWithReferences( entity.applyWithReferences(
"CBI.8.1", "CBI.8.1",
"must_redact entity found", "Must_redact found",
"Reg (EC) No 1107/2009 Art. 63 (2g)", "Reg (EC) No 1107/2009 Art. 63 (2g)",
$table.getEntitiesOfTypeInSameRow("must_redact", entity) $table.getEntitiesOfTypeInSameRow("must_redact", entity)
); );
}); });
end end
@ -272,6 +272,19 @@ rule "CBI.9.1: Redact all cells with Header Author as CBI_author (non vertebrate
.forEach(redactionEntity -> redactionEntity.redact("CBI.9.1", "Author found", "Article 39(e)(3) of Regulation (EC) No 178/2002")); .forEach(redactionEntity -> redactionEntity.redact("CBI.9.1", "Author found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end end
rule "CBI.9.2: Redact all cells with Header Author(s) as CBI_author (non vertebrate study)"
agenda-group "LOCAL_DICTIONARY_ADDS"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$table: Table(hasHeader("Author(s)"))
then
$table.streamTableCellsWithHeader("Author(s)")
.map(tableCell -> entityCreationService.bySemanticNode(tableCell, "CBI_author", EntityType.ENTITY))
.filter(Optional::isPresent)
.map(Optional::get)
.forEach(redactionEntity -> redactionEntity.redact("CBI.9.2", "Author(s) found", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)"));
end
// Rule unit: CBI.11 // Rule unit: CBI.11
rule "CBI.11.0: Recommend all CBI_author entities in Table with Vertebrate Study Y/N Header" rule "CBI.11.0: Recommend all CBI_author entities in Table with Vertebrate Study Y/N Header"
@ -285,7 +298,22 @@ rule "CBI.11.0: Recommend all CBI_author entities in Table with Vertebrate Study
// Rule unit: CBI.12 // Rule unit: CBI.12
rule "CBI.12.0: Redact and recommend TableCell with header 'Author' or 'Author(s)' and header 'Vertebrate study Y/N' with value 'Yes' (non vertebrate study)" rule "CBI.12.0: Redact and recommend TableCell with header 'Author' or 'Author(s)' and header 'Vertebrate study Y/N' with value 'Yes'"
agenda-group "LOCAL_DICTIONARY_ADDS"
when
$table: Table(hasHeader("Author(s)") || hasHeader("Author"), hasHeaderIgnoreCase("Vertebrate Study Y/N"))
TableCell(header, containsAnyStringIgnoreCase("Author", "Author(s)"), $authorCol: col) from $table.streamHeaders().toList()
TableCell(header, containsStringIgnoreCase("Vertebrate study Y/N"), $vertebrateCol: col) from $table.streamHeaders().toList()
$rowCell: TableCell(!header, containsAnyString("Yes", "Y"), $rowWithYes: row) from $table.streamCol($vertebrateCol).toList()
TableCell(row == $rowWithYes) from $table.streamCol($authorCol).toList()
then
entityCreationService.bySemanticNode($rowCell, "must_redact", EntityType.HINT)
.ifPresent(yesEntity -> {
yesEntity.skip("CBI.12.0", "must_redact");
});
end
rule "CBI.12.1: Redact and recommend TableCell with header 'Author' or 'Author(s)' and header 'Vertebrate study Y/N' with value 'Yes' (non vertebrate study)"
agenda-group "LOCAL_DICTIONARY_ADDS" agenda-group "LOCAL_DICTIONARY_ADDS"
when when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y") not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
@ -295,16 +323,15 @@ rule "CBI.12.0: Redact and recommend TableCell with header 'Author' or 'Author(s
TableCell(!header, containsAnyString("Yes", "Y"), $rowWithYes: row) from $table.streamCol($vertebrateCol).toList() TableCell(!header, containsAnyString("Yes", "Y"), $rowWithYes: row) from $table.streamCol($vertebrateCol).toList()
$authorCell: TableCell(row == $rowWithYes) from $table.streamCol($authorCol).toList() $authorCell: TableCell(row == $rowWithYes) from $table.streamCol($authorCol).toList()
then then
entityCreationService.bySemanticNode($authorCell, "CBI_author", EntityType.ENTITY) entityCreationService.bySemanticNode($authorCell, "CBI_author", EntityType.ENTITY)
.ifPresent(authorEntity -> { .ifPresent(authorEntity -> {
authorEntity.redact("CBI.12.0", "Redacted because it's row belongs to a vertebrate study", "Article 39(e)(3) of Regulation (EC) No 178/2002"); authorEntity.redact("CBI.12.1", "Redacted because it's row belongs to a vertebrate study", "Article 39(e)(3) of Regulation (EC) No 178/2002");
dictionary.addMultipleAuthorsAsRecommendation(authorEntity); dictionary.addMultipleAuthorsAsRecommendation(authorEntity);
}); });
end end
rule "CBI.12.1: Redact and recommend TableCell with header 'Author' or 'Author(s)' and header 'Vertebrate study Y/N' with value 'Yes' (vertebrate study)" rule "CBI.12.2: Redact and recommend TableCell with header 'Author' or 'Author(s)' and header 'Vertebrate study Y/N' with value 'Yes' (vertebrate study)"
agenda-group "LOCAL_DICTIONARY_ADDS" agenda-group "LOCAL_DICTIONARY_ADDS"
when when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y") FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
@ -317,13 +344,13 @@ rule "CBI.12.1: Redact and recommend TableCell with header 'Author' or 'Author(s
entityCreationService.bySemanticNode($authorCell, "CBI_author", EntityType.ENTITY) entityCreationService.bySemanticNode($authorCell, "CBI_author", EntityType.ENTITY)
.ifPresent(authorEntity -> { .ifPresent(authorEntity -> {
authorEntity.redact("CBI.12.1", "Redacted because it's row belongs to a vertebrate study", "Article 39(e)(2) of Regulation (EC) No 178/2002"); authorEntity.redact("CBI.12.2", "Redacted because it's row belongs to a vertebrate study", "Article 39(e)(2) of Regulation (EC) No 178/2002");
dictionary.addMultipleAuthorsAsRecommendation(authorEntity); dictionary.addMultipleAuthorsAsRecommendation(authorEntity);
}); });
end end
rule "CBI.12.2: Skip TableCell with header 'Author' or 'Author(s)' and header 'Vertebrate study Y/N' with value 'No'" rule "CBI.12.3: Skip TableCell with header 'Author' or 'Author(s)' and header 'Vertebrate study Y/N' with value 'No'"
when when
$table: Table(hasHeader("Author(s)") || hasHeader("Author"), hasHeaderIgnoreCase("Vertebrate Study Y/N")) $table: Table(hasHeader("Author(s)") || hasHeader("Author"), hasHeaderIgnoreCase("Vertebrate Study Y/N"))
TableCell(header, containsAnyStringIgnoreCase("Author", "Author(s)"), $authorCol: col) from $table.streamHeaders().toList() TableCell(header, containsAnyStringIgnoreCase("Author", "Author(s)"), $authorCol: col) from $table.streamHeaders().toList()
@ -332,16 +359,19 @@ rule "CBI.12.2: Skip TableCell with header 'Author' or 'Author(s)' and header 'V
$authorCell: TableCell(row == $rowWithNo) from $table.streamCol($authorCol).toList() $authorCell: TableCell(row == $rowWithNo) from $table.streamCol($authorCol).toList()
then then
entityCreationService.bySemanticNode($authorCell, "CBI_author", EntityType.ENTITY) entityCreationService.bySemanticNode($authorCell, "CBI_author", EntityType.ENTITY)
.ifPresent(authorEntity -> authorEntity.skip("CBI.12.2", "Not redacted because it's row does not belong to a vertebrate study")); .ifPresent(authorEntity -> authorEntity.skip("CBI.12.3", "Not redacted because it's row does not belong to a vertebrate study"));
end end
// Rule unit: CBI.14 // Rule unit: CBI.14
rule "CBI.14.0: Redact CBI_sponsor entities if preceded by \"batches produced at\"" rule "CBI.14.0: Redact CBI_sponsor entities if preceded by \"batches produced at\""
when when
$section: Section(containsStringIgnoreCase("batches produced at"))
$sponsorEntity: TextEntity(type() == "CBI_sponsor", textBefore.contains("batches produced at")) $sponsorEntity: TextEntity(type() == "CBI_sponsor", textBefore.contains("batches produced at"))
then then
$sponsorEntity.redact("CBI.14.0", "Redacted because it represents a sponsor company", "Reg (EC) No 1107/2009 Art. 63 (2g)"); $sponsorEntity.redact("CBI.14.0", "Redacted because it represents a sponsor company", "Reg (EC) No 1107/2009 Art. 63 (2g)");
entityCreationService.byString("batches produced at", "must_redact", EntityType.HINT, $section)
.forEach(entity -> entity.skip("CBI.14.0", "must_redact"));
end end
@ -362,10 +392,10 @@ rule "CBI.15.0: Redact row if row contains \"determination of residues\" and liv
containsStringIgnoreCase($residueKeyword), containsStringIgnoreCase($residueKeyword),
containsStringIgnoreCase($keyword)) containsStringIgnoreCase($keyword))
then then
entityCreationService.byString($keyword, "must_redact", EntityType.ENTITY, $section) entityCreationService.byString($keyword, "must_redact", EntityType.HINT, $section)
.toList(); .forEach(entity -> entity.skip("CBI.15.0", "must_redact"));
$section.getEntitiesOfType(List.of($keyword, $residueKeyword)) $section.getEntitiesOfType(List.of("CBI_author", "CBI_address"))
.forEach(redactionEntity -> redactionEntity.redact("CBI.15.0", "Determination of residues and keyword \"" + $keyword + "\" was found.", "Reg (EC) No 1107/2009 Art. 63 (2g)")); .forEach(redactionEntity -> redactionEntity.redact("CBI.15.0", "Determination of residues and keyword \"" + $keyword + "\" was found.", "Reg (EC) No 1107/2009 Art. 63 (2g)"));
end end
@ -383,8 +413,8 @@ rule "CBI.15.1: Redact CBI_author and CBI_address if row contains \"determinatio
$residueKeyword: String() from List.of("determination of residues", "determination of total residues") $residueKeyword: String() from List.of("determination of residues", "determination of total residues")
$table: Table(containsStringIgnoreCase($residueKeyword), containsStringIgnoreCase($keyword)) $table: Table(containsStringIgnoreCase($residueKeyword), containsStringIgnoreCase($keyword))
then then
entityCreationService.byString($keyword, "must_redact", EntityType.ENTITY, $table) entityCreationService.byString($keyword, "must_redact", EntityType.HINT, $table)
.toList(); .forEach(entity -> entity.skip("CBI.15.1", "must_redact"));
$table.streamEntitiesWhereRowContainsStringsIgnoreCase(List.of($keyword, $residueKeyword)) $table.streamEntitiesWhereRowContainsStringsIgnoreCase(List.of($keyword, $residueKeyword))
.filter(redactionEntity -> redactionEntity.isAnyType(List.of("CBI_author", "CBI_address"))) .filter(redactionEntity -> redactionEntity.isAnyType(List.of("CBI_author", "CBI_address")))
@ -393,7 +423,19 @@ rule "CBI.15.1: Redact CBI_author and CBI_address if row contains \"determinatio
// Rule unit: CBI.16 // Rule unit: CBI.16
rule "CBI.16.0: Add CBI_author with \"et al.\" RegEx (non vertebrate study)" rule "CBI.16.0: Add CBI_author with \"et al.\" RegEx"
agenda-group "LOCAL_DICTIONARY_ADDS"
when
$section: Section(containsString("et al."))
then
entityCreationService.byRegex("\\b([A-ZÄÖÜ][^\\s\\.,]+( [A-ZÄÖÜ]{1,2}\\.?)?( ?[A-ZÄÖÜ]\\.?)?) et al\\.?", "CBI_author", EntityType.ENTITY, 1, $section)
.forEach(entity -> {
entity.redact("CBI.16.0", "Author found by \"et al\" regex", "Reg (EC) No 1107/2009 Art. 63 (2g)");
dictionary.recommendEverywhere(entity);
});
end
rule "CBI.16.1: Add CBI_author with \"et al.\" RegEx (non vertebrate study)"
agenda-group "LOCAL_DICTIONARY_ADDS"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
@@ -401,12 +443,12 @@ rule "CBI.16.0: Add CBI_author with \"et al.\" RegEx (non vertebrate study)"
then
entityCreationService.byRegex("\\b([A-ZÄÖÜ][^\\s\\.,]+( [A-ZÄÖÜ]{1,2}\\.?)?( ?[A-ZÄÖÜ]\\.?)?) et al\\.?", "CBI_author", EntityType.ENTITY, 1, $section)
.forEach(entity -> {
entity.redact("CBI.16.1", "Author found by \"et al\" regex", "Article 39(e)(3) of Regulation (EC) No 178/2002");
dictionary.recommendEverywhere(entity);
});
end
rule "CBI.16.2: Add CBI_author with \"et al.\" RegEx (vertebrate study)"
agenda-group "LOCAL_DICTIONARY_ADDS"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
@@ -414,7 +456,19 @@ rule "CBI.16.1: Add CBI_author with \"et al.\" RegEx (vertebrate study)"
then
entityCreationService.byRegex("\\b([A-ZÄÖÜ][^\\s\\.,]+( [A-ZÄÖÜ]{1,2}\\.?)?( ?[A-ZÄÖÜ]\\.?)?) et al\\.?", "CBI_author", EntityType.ENTITY, 1, $section)
.forEach(entity -> {
entity.redact("CBI.16.2", "Author found by \"et al\" regex", "Article 39(e)(2) of Regulation (EC) No 178/2002");
dictionary.recommendEverywhere(entity);
});
end
rule "CBI.16.3: Add CBI_author with \"et al.\" RegEx"
agenda-group "LOCAL_DICTIONARY_ADDS"
when
$section: Section(containsString("et al."))
then
entityCreationService.byRegex("\\b([A-ZÄÖÜ][^\\s\\.,]+( [A-ZÄÖÜ]{1,2}\\.?)?( ?[A-ZÄÖÜ]\\.?)?) et al\\.?", "CBI_author", EntityType.ENTITY, 1, $section)
.forEach(entity -> {
entity.redact("CBI.16.3", "Author found by \"et al\" regex", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)");
dictionary.recommendEverywhere(entity);
});
end
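All of the CBI.16.x rules share the same author-detection pattern. A minimal Java sketch of how that regex behaves on free text (`AuthorRegexDemo` and `findAuthor` are illustrative names, not part of the rule base; the rules additionally create entities and dictionary recommendations, which this sketch omits):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AuthorRegexDemo {
    // Same pattern as the CBI.16.x rules: a capitalized surname,
    // optional initials, followed by "et al" with an optional period.
    static final Pattern AUTHOR = Pattern.compile(
            "\\b([A-ZÄÖÜ][^\\s\\.,]+( [A-ZÄÖÜ]{1,2}\\.?)?( ?[A-ZÄÖÜ]\\.?)?) et al\\.?");

    // Returns capture group 1 (the author name) of the first match, or null.
    static String findAuthor(String text) {
        Matcher m = AUTHOR.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(findAuthor("as reported by Smith JR et al. (2001)")); // Smith JR
    }
}
```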
@@ -472,7 +526,19 @@ rule "CBI.19.0: Expand CBI_author entities with salutation prefix"
// Rule unit: CBI.20
rule "CBI.20.0: Redact between \"PERFORMING LABORATORY\" and \"LABORATORY PROJECT ID:\""
agenda-group "LOCAL_DICTIONARY_ADDS"
when
$section: Section(!hasTables(), containsString("PERFORMING LABORATORY:"), containsString("LABORATORY PROJECT ID:"))
then
entityCreationService.betweenStrings("PERFORMING LABORATORY:", "LABORATORY PROJECT ID:", "CBI_address", EntityType.ENTITY, $section)
.forEach(laboratoryEntity -> {
laboratoryEntity.redact("CBI.20.0", "PERFORMING LABORATORY was found", "Reg (EC) No 1107/2009 Art. 63 (2g)");
dictionary.recommendEverywhere(laboratoryEntity);
});
end
rule "CBI.20.1: Redact between \"PERFORMING LABORATORY\" and \"LABORATORY PROJECT ID:\" (non vertebrate study)"
agenda-group "LOCAL_DICTIONARY_ADDS"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
@@ -480,12 +546,12 @@ rule "CBI.20.0: Redact between \"PERFORMING LABORATORY\" and \"LABORATORY PROJEC
then
entityCreationService.betweenStrings("PERFORMING LABORATORY:", "LABORATORY PROJECT ID:", "CBI_address", EntityType.ENTITY, $section)
.forEach(laboratoryEntity -> {
laboratoryEntity.skip("CBI.20.1", "PERFORMING LABORATORY was found for non vertebrate study");
dictionary.recommendEverywhere(laboratoryEntity);
});
end
rule "CBI.20.2: Redact between \"PERFORMING LABORATORY\" and \"LABORATORY PROJECT ID:\" (vertebrate study)"
agenda-group "LOCAL_DICTIONARY_ADDS"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
@@ -493,56 +559,117 @@ rule "CBI.20.1: Redact between \"PERFORMING LABORATORY\" and \"LABORATORY PROJEC
then
entityCreationService.betweenStrings("PERFORMING LABORATORY:", "LABORATORY PROJECT ID:", "CBI_address", EntityType.ENTITY, $section)
.forEach(laboratoryEntity -> {
laboratoryEntity.redact("CBI.20.2", "PERFORMING LABORATORY was found", "Article 39(e)(2) of Regulation (EC) No 178/2002");
dictionary.recommendEverywhere(laboratoryEntity);
});
end
rule "CBI.20.3: Redact between \"PERFORMING LABORATORY\" and \"LABORATORY PROJECT ID:\""
agenda-group "LOCAL_DICTIONARY_ADDS"
when
$section: Section(!hasTables(), containsString("PERFORMING LABORATORY:"), containsString("LABORATORY PROJECT ID:"))
then
entityCreationService.betweenStrings("PERFORMING LABORATORY:", "LABORATORY PROJECT ID:", "CBI_address", EntityType.ENTITY, $section)
.forEach(laboratoryEntity -> {
laboratoryEntity.redact("CBI.20.3", "PERFORMING LABORATORY was found", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)");
dictionary.recommendEverywhere(laboratoryEntity);
});
end
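The CBI.20.x rules delegate to `entityCreationService.betweenStrings`, whose implementation is not part of this diff. A hypothetical Java sketch of the extraction semantics the rules assume (the helper name `between` and the single-occurrence behavior are assumptions):

```java
public class BetweenDemo {
    // Hypothetical sketch: return the trimmed text between the first
    // occurrence of start and the next occurrence of end, or null.
    static String between(String text, String start, String end) {
        int s = text.indexOf(start);
        if (s < 0) return null;
        int from = s + start.length();
        int to = text.indexOf(end, from);
        return to < 0 ? null : text.substring(from, to).trim();
    }

    public static void main(String[] args) {
        String page = "PERFORMING LABORATORY: Acme Labs GmbH LABORATORY PROJECT ID: 4711";
        System.out.println(between(page, "PERFORMING LABORATORY:", "LABORATORY PROJECT ID:")); // Acme Labs GmbH
    }
}
```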
// Rule unit: CBI.23
rule "CBI.23.0: Redact between \"AUTHOR(S)\" and \"(STUDY) COMPLETION DATE\" (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$document: Document(containsStringIgnoreCase("AUTHOR(S)"), containsAnyStringIgnoreCase("COMPLETION DATE", "STUDY COMPLETION DATE"))
then
entityCreationService.shortestBetweenAnyStringIgnoreCase(List.of("AUTHOR(S)", "AUTHOR(S):"), List.of("COMPLETION DATE", "COMPLETION DATE:", "STUDY COMPLETION DATE", "STUDY COMPLETION DATE:"), "CBI_author", EntityType.ENTITY, $document)
.forEach(authorEntity -> authorEntity.redact("CBI.23.0", "AUTHOR(S) was found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
rule "CBI.23.1: Redact between \"AUTHOR(S)\" and \"(STUDY) COMPLETION DATE\" (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$document: Document(containsStringIgnoreCase("AUTHOR(S)"), containsAnyStringIgnoreCase("COMPLETION DATE", "STUDY COMPLETION DATE"))
then
entityCreationService.shortestBetweenAnyStringIgnoreCase(List.of("AUTHOR(S)", "AUTHOR(S):"), List.of("COMPLETION DATE", "COMPLETION DATE:", "STUDY COMPLETION DATE", "STUDY COMPLETION DATE:"), "CBI_author", EntityType.ENTITY, $document)
.forEach(authorEntity -> authorEntity.redact("CBI.23.1", "AUTHOR(S) was found", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
//------------------------------------ PII rules ------------------------------------
// Rule unit: PII.0
rule "PII.0.0: Redact all PII"
when
$pii: TextEntity(type() == "PII", dictionaryEntry)
then
$pii.redact("PII.0.0", "Personal Information found", "Reg (EC) No 1107/2009 Art. 63 (2g)");
end
rule "PII.0.1: Redact all PII (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$pii: TextEntity(type() == "PII", dictionaryEntry)
then
$pii.redact("PII.0.1", "Personal Information found", "Article 39(e)(3) of Regulation (EC) No 178/2002");
end
rule "PII.0.2: Redact all PII (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$pii: TextEntity(type() == "PII", dictionaryEntry)
then
$pii.redact("PII.0.2", "Personal Information found", "Article 39(e)(2) of Regulation (EC) No 178/2002");
end
rule "PII.0.3: Redact all PII"
when
$pii: TextEntity(type() == "PII", dictionaryEntry)
then
$pii.redact("PII.0.3", "Personal Information found", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)");
end end
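The rule pairs above all branch on the same "Vertebrate Study" file attribute. In Drools, `soundslike` is a Soundex comparison, so the guard also tolerates phonetically similar spellings; the plain-Java approximation below covers only the literal "Yes"/"y" cases and is illustrative, not the engine's actual matching:

```java
public class VertebrateGuardDemo {
    // Rough approximation of:
    //   value soundslike "Yes" || value.toLowerCase() == "y"
    // (Soundex matching is omitted; only literal spellings are handled.)
    static boolean isVertebrate(String value) {
        return value.equalsIgnoreCase("yes") || value.equalsIgnoreCase("y");
    }

    public static void main(String[] args) {
        System.out.println(isVertebrate("Yes")); // true
        System.out.println(isVertebrate("No"));  // false
    }
}
```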
// Rule unit: PII.1
rule "PII.1.0: Redact Emails by RegEx"
when
$section: Section(containsString("@"))
then
entityCreationService.byRegex("\\b([A-Za-z0-9._%+\\-]+@[A-Za-z0-9.\\-]+\\.[A-Za-z\\-]{1,23}[A-Za-z])\\b", "PII", EntityType.ENTITY, 1, $section)
.forEach(emailEntity -> emailEntity.redact("PII.1.0", "Found by Email Regex", "Reg (EC) No 1107/2009 Art. 63 (2e)"));
end
rule "PII.1.1: Redact Emails by RegEx (Non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(containsString("@"))
then
entityCreationService.byRegex("\\b([A-Za-z0-9._%+\\-]+@[A-Za-z0-9.\\-]+\\.[A-Za-z\\-]{1,23}[A-Za-z])\\b", "PII", EntityType.ENTITY, 1, $section)
.forEach(emailEntity -> emailEntity.redact("PII.1.1", "Found by Email Regex", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
rule "PII.1.2: Redact Emails by RegEx (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(containsString("@"))
then
entityCreationService.byRegex("\\b([A-Za-z0-9._%+\\-]+@[A-Za-z0-9.\\-]+\\.[A-Za-z\\-]{1,23}[A-Za-z])\\b", "PII", EntityType.ENTITY, 1, $section)
.forEach(emailEntity -> emailEntity.redact("PII.1.2", "Found by Email Regex", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
rule "PII.1.5: Redact Emails by RegEx"
when
$section: Section(containsString("@"))
then
entityCreationService.byRegex("\\b([A-Za-z0-9._%+\\-]+@[A-Za-z0-9.\\-]+\\.[A-Za-z\\-]{1,23}[A-Za-z])\\b", "PII", EntityType.ENTITY, 1, $section)
.forEach(emailEntity -> emailEntity.redact("PII.1.5", "Found by Email Regex", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)"));
end end
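The PII.1.x rules all reuse one email pattern. A minimal Java sketch of that regex on plain text (`EmailRegexDemo` and `findEmail` are illustrative names; the rules additionally create PII entities, which this sketch omits):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailRegexDemo {
    // Same pattern as the PII.1.x rules: local part, "@", domain,
    // and a 2-24 character top-level label ending in a letter.
    static final Pattern EMAIL = Pattern.compile(
            "\\b([A-Za-z0-9._%+\\-]+@[A-Za-z0-9.\\-]+\\.[A-Za-z\\-]{1,23}[A-Za-z])\\b");

    // Returns the first email address found, or null.
    static String findEmail(String text) {
        Matcher m = EMAIL.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(findEmail("Contact: jane.doe@example.org for details")); // jane.doe@example.org
    }
}
```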
// Rule unit: PII.4
rule "PII.4.0: Redact line after contact information keywords"
when
$contactKeyword: String() from List.of("Contact point:",
"Contact:",
"Alternative contact:",
@@ -568,7 +695,62 @@ rule "PII.4.0: Redact line after contact information keywords (non vertebrate st
.forEach(contactEntity -> contactEntity.redact("PII.4.0", "Found after \"" + $contactKeyword + "\" contact keyword", "Reg (EC) No 1107/2009 Art. 63 (2e)"));
end
rule "PII.4.1: Redact line after contact information keywords"
when
$contactKeyword: String() from List.of("Contact point:",
"Contact:",
"Alternative contact:",
"European contact:",
"No:",
"Contact:",
"Tel.:",
"Tel:",
"Telephone number:",
"Telephone No:",
"Telephone:",
"Phone No.",
"Phone:",
"Fax number:",
"Fax:",
"E-mail:",
"Email:",
"e-mail:",
"E-mail address:")
$section: Section(containsString($contactKeyword))
then
entityCreationService.lineAfterString($contactKeyword, "PII", EntityType.ENTITY, $section)
.forEach(contactEntity -> contactEntity.redact("PII.4.1", "Found after \"" + $contactKeyword + "\" contact keyword", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)"));
end
rule "PII.4.2: Redact line after contact information keywords (Non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$contactKeyword: String() from List.of("Contact point:",
"Contact:",
"Alternative contact:",
"European contact:",
"No:",
"Contact:",
"Tel.:",
"Tel:",
"Telephone number:",
"Telephone No:",
"Telephone:",
"Phone No.",
"Phone:",
"Fax number:",
"Fax:",
"E-mail:",
"Email:",
"e-mail:",
"E-mail address:")
$section: Section(containsString($contactKeyword))
then
entityCreationService.lineAfterString($contactKeyword, "PII", EntityType.ENTITY, $section)
.forEach(contactEntity -> contactEntity.redact("PII.4.2", "Found after \"" + $contactKeyword + "\" contact keyword", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
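`entityCreationService.lineAfterString` is used throughout the PII.4.x rules, but its implementation is not part of this diff. A hypothetical sketch of the behavior the rules assume, namely capturing the rest of the line after a contact keyword (`lineAfter` is an assumed name):

```java
public class LineAfterDemo {
    // Hypothetical sketch: return the remainder of the line following
    // the first occurrence of keyword, trimmed, or null if absent.
    static String lineAfter(String text, String keyword) {
        int i = text.indexOf(keyword);
        if (i < 0) return null;
        int from = i + keyword.length();
        int nl = text.indexOf('\n', from);
        return (nl < 0 ? text.substring(from) : text.substring(from, nl)).trim();
    }

    public static void main(String[] args) {
        System.out.println(lineAfter("Tel.: +49 30 123456\nFax: +49 30 654321", "Tel.:")); // +49 30 123456
    }
}
```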
rule "PII.4.3: Redact line after contact information keywords (Vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$contactKeyword: String() from List.of("Contact point:",
@@ -593,12 +775,23 @@ rule "PII.4.1: Redact line after contact information keywords (vertebrate study)
$section: Section(containsString($contactKeyword))
then
entityCreationService.lineAfterString($contactKeyword, "PII", EntityType.ENTITY, $section)
.forEach(contactEntity -> contactEntity.redact("PII.4.3", "Found after \"" + $contactKeyword + "\" contact keyword", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
// Rule unit: PII.6
rule "PII.6.0: Redact line between contact keywords"
when
$section: Section((containsString("No:") && containsString("Fax")) || (containsString("Contact:") && containsString("Tel")))
then
Stream.concat(
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
)
.forEach(contactEntity -> contactEntity.redact("PII.6.0", "Found between contact keywords", "Reg (EC) No 1107/2009 Art. 63 (2e)"));
end
rule "PII.6.1: Redact line between contact keywords (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section((containsString("No:") && containsString("Fax")) || (containsString("Contact:") && containsString("Tel")))
@@ -607,10 +800,10 @@ rule "PII.6.0: Redact line between contact keywords (non vertebrate study)"
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
)
.forEach(contactEntity -> contactEntity.redact("PII.6.1", "Found between contact keywords", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
rule "PII.6.2: Redact line between contact keywords (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section((containsString("No:") && containsString("Fax")) || (containsString("Contact:") && containsString("Tel")))
@@ -619,12 +812,41 @@ rule "PII.6.1: Redact line between contact keywords (vertebrate study)"
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
)
.forEach(contactEntity -> contactEntity.redact("PII.6.2", "Found between contact keywords", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
rule "PII.6.3: Redact line between contact keywords (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section((containsString("No:") && containsString("Fax")) || (containsString("Contact:") && containsString("Tel")))
then
Stream.concat(
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
)
.forEach(contactEntity -> contactEntity.redact("PII.6.3", "Found between contact keywords", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)"));
end end
// Rule unit: PII.7
rule "PII.7.0: Redact contact information if applicant is found"
when
$section: Section(getHeadline().containsString("applicant") ||
getHeadline().containsString("Primary contact") ||
getHeadline().containsString("Alternative contact") ||
containsString("Applicant") ||
containsString("Telephone number:"))
then
Stream.concat(entityCreationService.lineAfterStrings(List.of("Contact point:", "Contact:", "Alternative contact:", "European contact:", "No:", "Contact:", "Tel.:", "Tel:", "Telephone number:",
"Telephone No:", "Telephone:", "Phone No.", "Phone:", "Fax number:", "Fax:", "E-mail:", "Email:", "e-mail:", "E-mail address:"), "PII", EntityType.ENTITY, $section),
Stream.concat(
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
))
.forEach(entity -> entity.redact("PII.7.0", "Applicant information was found", "Reg (EC) No 1107/2009 Art. 63 (2e)"));
end
rule "PII.7.1: Redact contact information if applicant is found (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(getHeadline().containsString("applicant") ||
@@ -639,10 +861,10 @@ rule "PII.7.0: Redact contact information if applicant is found (non vertebrate
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
))
.forEach(entity -> entity.redact("PII.7.1", "Applicant information was found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
rule "PII.7.2: Redact contact information if applicant is found (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(getHeadline().containsString("applicant") ||
@@ -657,14 +879,13 @@ rule "PII.7.1: Redact contact information if applicant is found (vertebrate stud
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
))
.forEach(entity -> entity.redact("PII.7.2", "Applicant information was found", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
// Rule unit: PII.8
rule "PII.8.0: Redact contact information if producer is found"
when
$section: Section(containsStringIgnoreCase("producer of the plant protection") ||
containsStringIgnoreCase("producer of the active substance") ||
containsStringIgnoreCase("manufacturer of the active substance") ||
@@ -680,7 +901,25 @@ rule "PII.8.0: Redact contact information if producer is found (non vertebrate s
.forEach(entity -> entity.redact("PII.8.0", "Producer was found", "Reg (EC) No 1107/2009 Art. 63 (2e)"));
end
rule "PII.8.1: Redact contact information if producer is found (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(containsStringIgnoreCase("producer of the plant protection") ||
containsStringIgnoreCase("producer of the active substance") ||
containsStringIgnoreCase("manufacturer of the active substance") ||
containsStringIgnoreCase("manufacturer:") ||
containsStringIgnoreCase("Producer or producers of the active substance"))
then
Stream.concat(entityCreationService.lineAfterStrings(List.of("Contact point:", "Contact:", "Alternative contact:", "European contact:", "No:", "Contact:", "Tel.:", "Tel:", "Telephone number:",
"Telephone No:", "Telephone:", "Phone No.", "Phone:", "Fax number:", "Fax:", "E-mail:", "Email:", "e-mail:", "E-mail address:"), "PII", EntityType.ENTITY, $section),
Stream.concat(
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
))
.forEach(entity -> entity.redact("PII.8.1", "Producer was found", "Article 39(e)(3) of Regulation (EC) No 178/2002"));
end
rule "PII.8.2: Redact contact information if producer is found (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$section: Section(containsStringIgnoreCase("producer of the plant protection") ||
@@ -695,27 +934,25 @@ rule "PII.8.1: Redact contact information if producer is found (vertebrate study
entityCreationService.betweenStrings("No:", "Fax", "PII", EntityType.ENTITY, $section),
entityCreationService.betweenStrings("Contact:", "Tel", "PII", EntityType.ENTITY, $section)
))
.forEach(entity -> entity.redact("PII.8.2", "Producer was found", "Article 39(e)(2) of Regulation (EC) No 178/2002"));
end
// Rule unit: PII.9
rule "PII.9.0: Redact between \"AUTHOR(S)\" and \"(STUDY) COMPLETION DATE\""
when
$document: Document(containsStringIgnoreCase("AUTHOR(S)"), containsAnyStringIgnoreCase("COMPLETION DATE", "STUDY COMPLETION DATE"))
then
entityCreationService.shortestBetweenAnyStringIgnoreCase(List.of("AUTHOR(S)", "AUTHOR(S):"), List.of("COMPLETION DATE", "COMPLETION DATE:", "STUDY COMPLETION DATE", "STUDY COMPLETION DATE:"), "PII", EntityType.ENTITY, $document)
.forEach(authorEntity -> authorEntity.redact("PII.9.0", "AUTHOR(S) was found", "Reg (EC) No 1107/2009 Art. 63 (2e)"));
end
rule "PII.9.3: Redact between \"AUTHOR(S)\" and \"(STUDY) COMPLETION DATE\""
when
$document: Document(containsStringIgnoreCase("AUTHOR(S)"), containsAnyStringIgnoreCase("COMPLETION DATE", "STUDY COMPLETION DATE"))
then
entityCreationService.shortestBetweenAnyStringIgnoreCase(List.of("AUTHOR(S)", "AUTHOR(S):"), List.of("COMPLETION DATE", "COMPLETION DATE:", "STUDY COMPLETION DATE", "STUDY COMPLETION DATE:"), "PII", EntityType.ENTITY, $document)
.forEach(authorEntity -> authorEntity.redact("PII.9.3", "AUTHOR(S) was found", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)"));
end
@@ -762,38 +999,66 @@ rule "ETC.1.0: Redact Purity"
// Rule unit: ETC.2
rule "ETC.2.0: Redact signatures"
when
$signature: Image(imageType == ImageType.SIGNATURE)
then
$signature.redact("ETC.2.0", "Signature Found", "Reg (EC) No 1107/2009 Art. 63 (2g)");
end
rule "ETC.2.1: Redact signatures (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$signature: Image(imageType == ImageType.SIGNATURE)
then
$signature.redact("ETC.2.1", "Signature Found", "Article 39(e)(3) of Regulation (EC) No 178/2002");
end
rule "ETC.2.2: Redact signatures (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$signature: Image(imageType == ImageType.SIGNATURE)
then
$signature.redact("ETC.2.2", "Signature Found", "Article 39(e)(2) of Regulation (EC) No 178/2002");
end
rule "ETC.2.3: Redact signatures"
when
$signature: Image(imageType == ImageType.SIGNATURE)
then
$signature.redact("ETC.2.3", "Signature Found", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)");
end end
// Rule unit: ETC.3
rule "ETC.3.0: Redact logos"
when
$logo: Image(imageType == ImageType.LOGO)
then
$logo.redact("ETC.3.0", "Logo Found", "Reg (EC) No 1107/2009 Art. 63 (2g)");
end
rule "ETC.3.1: Skip logos (non vertebrate study)"
when
not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$logo: Image(imageType == ImageType.LOGO)
then
$logo.skip("ETC.3.1", "Logo Found");
end
rule "ETC.3.2: Redact logos (vertebrate study)"
when
FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
$logo: Image(imageType == ImageType.LOGO)
then
$logo.redact("ETC.3.2", "Logo Found", "Article 39(e)(2) of Regulation (EC) No 178/2002");
end
rule "ETC.3.3: Redact logos"
when
$logo: Image(imageType == ImageType.LOGO)
then
$logo.redact("ETC.3.3", "Logo Found", "Article 4(1)(b), Regulation (EC) No 1049/2001 (Personal data)");
end end
@@ -807,13 +1072,23 @@ rule "ETC.4.0: Redact dossier dictionary entries"
// Rule unit: ETC.5
rule "ETC.5.0: Skip dossier_redaction entries if confidentiality is 'confidential'"
when
FileAttribute(label == "Confidentiality", value == "confidential")
$dossierRedaction: TextEntity(type() == "dossier_redaction")
then
$dossierRedaction.skip("ETC.5.0", "Ignore dossier_redaction when confidential");
$dossierRedaction.getIntersectingNodes().forEach(node -> update(node));
end
rule "ETC.5.1: Remove dossier_redaction entries if confidentiality is not 'confidential'"
salience 256
when
not FileAttribute(label == "Confidentiality", value == "confidential")
$dossierRedaction: TextEntity(type() == "dossier_redaction")
then
$dossierRedaction.remove("ETC.5.1", "Remove dossier_redaction when not confidential");
retract($dossierRedaction);
end
@@ -1006,7 +1281,7 @@ rule "MAN.3.2: Apply image recategorization"
rule "MAN.3.3: Apply recategorization entities by default"
salience 128
when
$entity: TextEntity(getManualOverwrite().getRecategorized().orElse(false), !dictionary.isHint(type()))
then
$entity.apply("MAN.3.3", "Recategorized entities are applied by default.", $entity.legalBasis());
end
@@ -1042,36 +1317,20 @@ rule "MAN.4.1: Apply legal basis change"
rule "X.0.0: Remove Entity contained by Entity of same type"
salience 65
when
$larger: TextEntity($type: type(), $entityType: entityType, !removed())
$contained: TextEntity(containedBy($larger), type() == $type, entityType == $entityType, this != $larger, !hasManualChanges())
then
$contained.remove("X.0.0", "remove Entity contained by Entity of same type");
retract($contained);
end
// Rule unit: X.1
rule "X.1.0: Merge intersecting Entities of same type"
salience 64
when
$first: TextEntity($type: type(), $entityType: entityType, !resized(), active())
$second: TextEntity(intersects($first), type() == $type, entityType == $entityType, this != $first, !hasManualChanges(), active())
then
TextEntity mergedEntity = entityCreationService.mergeEntitiesOfSameType(List.of($first, $second), $type, $entityType, document);
$first.remove("X.1.0", "merge intersecting Entities of same type");
$second.remove("X.1.0", "merge intersecting Entities of same type");
retract($first);
retract($second);
mergedEntity.getIntersectingNodes().forEach(node -> update(node));
end
// Rule unit: X.2 // Rule unit: X.2
rule "X.2.0: Remove Entity of type ENTITY when contained by FALSE_POSITIVE" rule "X.2.0: Remove Entity of type ENTITY when contained by FALSE_POSITIVE"
salience 64 salience 64
when when
$falsePositive: TextEntity($type: type(), entityType == EntityType.FALSE_POSITIVE, active()) $falsePositive: TextEntity($type: type(), entityType == EntityType.FALSE_POSITIVE, active())
$entity: TextEntity(containedBy($falsePositive), type() == $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), !hasManualChanges(), active()) $entity: TextEntity(containedBy($falsePositive), type() == $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), !hasManualChanges())
then then
$entity.getIntersectingNodes().forEach(node -> update(node)); $entity.getIntersectingNodes().forEach(node -> update(node));
$entity.remove("X.2.0", "remove Entity of type ENTITY when contained by FALSE_POSITIVE"); $entity.remove("X.2.0", "remove Entity of type ENTITY when contained by FALSE_POSITIVE");
@ -1084,7 +1343,7 @@ rule "X.3.0: Remove Entity of type RECOMMENDATION when contained by FALSE_RECOMM
salience 64 salience 64
when when
$falseRecommendation: TextEntity($type: type(), entityType == EntityType.FALSE_RECOMMENDATION, active()) $falseRecommendation: TextEntity($type: type(), entityType == EntityType.FALSE_RECOMMENDATION, active())
$recommendation: TextEntity(containedBy($falseRecommendation), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active()) $recommendation: TextEntity(containedBy($falseRecommendation), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
then then
$recommendation.remove("X.3.0", "remove Entity of type RECOMMENDATION when contained by FALSE_RECOMMENDATION"); $recommendation.remove("X.3.0", "remove Entity of type RECOMMENDATION when contained by FALSE_RECOMMENDATION");
retract($recommendation); retract($recommendation);
@ -1096,7 +1355,7 @@ rule "X.4.0: Remove Entity of type RECOMMENDATION when text range equals ENTITY
salience 256 salience 256
when when
$entity: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active()) $entity: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
$recommendation: TextEntity(getTextRange().equals($entity.getTextRange()), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active()) $recommendation: TextEntity(getTextRange().equals($entity.getTextRange()), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
then then
$entity.addEngines($recommendation.getEngines()); $entity.addEngines($recommendation.getEngines());
$recommendation.remove("X.4.0", "remove Entity of type RECOMMENDATION when text range equals ENTITY with same type"); $recommendation.remove("X.4.0", "remove Entity of type RECOMMENDATION when text range equals ENTITY with same type");
@ -1109,7 +1368,7 @@ rule "X.5.0: Remove Entity of type RECOMMENDATION when intersected by ENTITY"
salience 256 salience 256
when when
$entity: TextEntity((entityType == EntityType.ENTITY || entityType == EntityType.HINT), active()) $entity: TextEntity((entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
$recommendation: TextEntity(intersects($entity), entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active()) $recommendation: TextEntity(intersects($entity), entityType == EntityType.RECOMMENDATION, !hasManualChanges())
then then
$recommendation.remove("X.5.0", "remove Entity of type RECOMMENDATION when intersected by ENTITY"); $recommendation.remove("X.5.0", "remove Entity of type RECOMMENDATION when intersected by ENTITY");
retract($recommendation); retract($recommendation);
@ -1119,7 +1378,7 @@ rule "X.5.1: Remove Entity of type RECOMMENDATION when contained by RECOMMENDATI
salience 256 salience 256
when when
$entity: TextEntity($type: type(), entityType == EntityType.RECOMMENDATION, active()) $entity: TextEntity($type: type(), entityType == EntityType.RECOMMENDATION, active())
$recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active()) $recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
then then
$recommendation.remove("X.5.1", "remove Entity of type RECOMMENDATION when contained by RECOMMENDATION"); $recommendation.remove("X.5.1", "remove Entity of type RECOMMENDATION when contained by RECOMMENDATION");
retract($recommendation); retract($recommendation);
@ -1127,26 +1386,26 @@ rule "X.5.1: Remove Entity of type RECOMMENDATION when contained by RECOMMENDATI
// Rule unit: X.6 // Rule unit: X.6
rule "X.6.0: Remove Entity of lower rank, when contained by by entity of type ENTITY" rule "X.6.0: Remove Entity of lower rank, when contained by entity of type ENTITY or HINT"
salience 32 salience 32
when when
$higherRank: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active()) $higherRank: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
$lowerRank: TextEntity(containedBy($higherRank), type() != $type, dictionary.getDictionaryRank(type) < dictionary.getDictionaryRank($type), !hasManualChanges(), active()) $lowerRank: TextEntity(containedBy($higherRank), type() != $type, dictionary.getDictionaryRank(type) < dictionary.getDictionaryRank($type), !hasManualChanges())
then then
$lowerRank.getIntersectingNodes().forEach(node -> update(node)); $lowerRank.getIntersectingNodes().forEach(node -> update(node));
$lowerRank.remove("X.6.0", "remove Entity of lower rank, when contained by entity of type ENTITY"); $lowerRank.remove("X.6.0", "remove Entity of lower rank, when contained by entity of type ENTITY or HINT");
retract($lowerRank); retract($lowerRank);
end end
rule "X.6.1: remove Entity of higher rank, when intersected by entity of type ENTITY and length of lower rank Entity is bigger than the higher rank Entity" rule "X.6.1: remove Entity, when contained in another entity of type ENTITY or HINT with larger text range"
salience 32 salience 32
when when
$higherRank: TextEntity($type: type(), $value: value, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active(), !hasManualChanges()) $outer: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
$lowerRank: TextEntity(intersects($higherRank), type() != $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), dictionary.getDictionaryRank(type) < dictionary.getDictionaryRank($type), active(), $lowerRank.getValue().length() > $value.length()) $inner: TextEntity(containedBy($outer), type() != $type, $outer.getTextRange().length > getTextRange().length(), !hasManualChanges())
then then
$higherRank.getIntersectingNodes().forEach(node -> update(node)); $inner.getIntersectingNodes().forEach(node -> update(node));
$higherRank.remove("X.6.1", "remove Entity of higher rank, when intersected by entity of type ENTITY and length of lower rank Entity is bigger than the higher rank Entity"); $inner.remove("X.6.1", "remove Entity, when contained in another entity of type ENTITY or HINT with larger text range");
retract($higherRank); retract($inner);
end end
@ -1173,6 +1432,18 @@ rule "X.8.1: Remove Entity when intersected by imported Entity"
end end
// Rule unit: X.11
rule "X.11.0: Remove dictionary entity which intersects with a manual entity"
salience 64
when
$manualEntity: TextEntity(engines contains Engine.MANUAL, active())
$dictionaryEntity: TextEntity(intersects($manualEntity), dictionaryEntry, engines not contains Engine.MANUAL)
then
$dictionaryEntity.remove("X.11.0", "remove dictionary entity which intersects with a manual entity");
retract($dictionaryEntity);
end
//------------------------------------ File attributes rules ------------------------------------ //------------------------------------ File attributes rules ------------------------------------
// Rule unit: FA.1 // Rule unit: FA.1
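The X-unit rules above lean on two text-range predicates, `containedBy` and `intersects`, whose exact semantics are not part of this diff. A minimal Python sketch of the assumed model, using half-open character ranges:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRange:
    """Assumed model of the engine's text ranges: half-open [start, end)."""
    start: int
    end: int

    def length(self) -> int:
        return self.end - self.start

    def contained_by(self, other: "TextRange") -> bool:
        # containedBy(...) as used by X.0.0 / X.6.0: fully inside the other range.
        return other.start <= self.start and self.end <= other.end

    def intersects(self, other: "TextRange") -> bool:
        # intersects(...) as used by X.5.0 / X.11.0: at least one shared character.
        return self.start < other.end and other.start < self.end
```

Under this model a range contains itself, which would explain why X.0.0 additionally requires `this != $larger` before retracting the contained entity.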


@@ -80,12 +80,12 @@ rule "CBI.0.0: Redact CBI Authors (non vertebrate Study)"
 //------------------------------------ PII rules ------------------------------------
 // Rule unit: PII.0
-rule "PII.0.0: Redact all PII (non vertebrate study)"
+rule "PII.0.1: Redact all PII (non vertebrate study)"
 when
     not FileAttribute(label == "Vertebrate Study", value soundslike "Yes" || value.toLowerCase() == "y")
     $pii: TextEntity(type() == "PII", dictionaryEntry)
 then
-    $pii.redact("PII.0.0", "Personal Information found", "Article 39(e)(3) of Regulation (EC) No 178/2002");
+    $pii.redact("PII.0.1", "Personal Information found", "Article 39(e)(3) of Regulation (EC) No 178/2002");
 end
@@ -212,7 +212,7 @@ rule "MAN.3.1: Apply entity recategorization of same type"
 rule "MAN.3.3: Apply recategorization entities by default"
 salience 128
 when
-    $entity: IEntity(getManualOverwrite().getRecategorized().orElse(false), !dictionary.isHint(type()))
+    $entity: TextEntity(getManualOverwrite().getRecategorized().orElse(false), !dictionary.isHint(type()))
 then
     $entity.apply("MAN.3.3", "Recategorized entities are applied by default.", $entity.legalBasis());
 end
@@ -248,36 +248,20 @@ rule "MAN.4.1: Apply legal basis change"
 rule "X.0.0: Remove Entity contained by Entity of same type"
 salience 65
 when
-    $larger: TextEntity($type: type(), $entityType: entityType, active() || skipped())
-    $contained: TextEntity(containedBy($larger), type() == $type, entityType == $entityType, this != $larger, !hasManualChanges(), active())
+    $larger: TextEntity($type: type(), $entityType: entityType, !removed())
+    $contained: TextEntity(containedBy($larger), type() == $type, entityType == $entityType, this != $larger, !hasManualChanges())
 then
     $contained.remove("X.0.0", "remove Entity contained by Entity of same type");
     retract($contained);
 end
-// Rule unit: X.1
-rule "X.1.0: Merge intersecting Entities of same type"
-salience 64
-when
-    $first: TextEntity($type: type(), $entityType: entityType, !resized(), active())
-    $second: TextEntity(intersects($first), type() == $type, entityType == $entityType, this != $first, !hasManualChanges(), active())
-then
-    TextEntity mergedEntity = entityCreationService.mergeEntitiesOfSameType(List.of($first, $second), $type, $entityType, document);
-    $first.remove("X.1.0", "merge intersecting Entities of same type");
-    $second.remove("X.1.0", "merge intersecting Entities of same type");
-    retract($first);
-    retract($second);
-    mergedEntity.getIntersectingNodes().forEach(node -> update(node));
-end
 // Rule unit: X.2
 rule "X.2.0: Remove Entity of type ENTITY when contained by FALSE_POSITIVE"
 salience 64
 when
     $falsePositive: TextEntity($type: type(), entityType == EntityType.FALSE_POSITIVE, active())
-    $entity: TextEntity(containedBy($falsePositive), type() == $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), !hasManualChanges(), active())
+    $entity: TextEntity(containedBy($falsePositive), type() == $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), !hasManualChanges())
 then
     $entity.getIntersectingNodes().forEach(node -> update(node));
     $entity.remove("X.2.0", "remove Entity of type ENTITY when contained by FALSE_POSITIVE");
@@ -290,7 +274,7 @@ rule "X.3.0: Remove Entity of type RECOMMENDATION when contained by FALSE_RECOMM
 salience 64
 when
     $falseRecommendation: TextEntity($type: type(), entityType == EntityType.FALSE_RECOMMENDATION, active())
-    $recommendation: TextEntity(containedBy($falseRecommendation), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+    $recommendation: TextEntity(containedBy($falseRecommendation), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
     $recommendation.remove("X.3.0", "remove Entity of type RECOMMENDATION when contained by FALSE_RECOMMENDATION");
     retract($recommendation);
@@ -302,7 +286,7 @@ rule "X.4.0: Remove Entity of type RECOMMENDATION when text range equals ENTITY
 salience 256
 when
     $entity: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
-    $recommendation: TextEntity(getTextRange().equals($entity.getTextRange()), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+    $recommendation: TextEntity(getTextRange().equals($entity.getTextRange()), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
     $entity.addEngines($recommendation.getEngines());
     $recommendation.remove("X.4.0", "remove Entity of type RECOMMENDATION when text range equals ENTITY with same type");
@@ -315,7 +299,7 @@ rule "X.5.0: Remove Entity of type RECOMMENDATION when intersected by ENTITY"
 salience 256
 when
     $entity: TextEntity((entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
-    $recommendation: TextEntity(intersects($entity), entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+    $recommendation: TextEntity(intersects($entity), entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
     $recommendation.remove("X.5.0", "remove Entity of type RECOMMENDATION when intersected by ENTITY");
     retract($recommendation);
@@ -325,7 +309,7 @@ rule "X.5.1: Remove Entity of type RECOMMENDATION when contained by RECOMMENDATI
 salience 256
 when
     $entity: TextEntity($type: type(), entityType == EntityType.RECOMMENDATION, active())
-    $recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+    $recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
     $recommendation.remove("X.5.1", "remove Entity of type RECOMMENDATION when contained by RECOMMENDATION");
     retract($recommendation);
@@ -333,26 +317,26 @@ rule "X.5.1: Remove Entity of type RECOMMENDATION when contained by RECOMMENDATI
 // Rule unit: X.6
-rule "X.6.0: Remove Entity of lower rank, when contained by by entity of type ENTITY"
+rule "X.6.0: Remove Entity of lower rank, when contained by entity of type ENTITY or HINT"
 salience 32
 when
     $higherRank: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
-    $lowerRank: TextEntity(containedBy($higherRank), type() != $type, dictionary.getDictionaryRank(type) < dictionary.getDictionaryRank($type), !hasManualChanges(), active())
+    $lowerRank: TextEntity(containedBy($higherRank), type() != $type, dictionary.getDictionaryRank(type) < dictionary.getDictionaryRank($type), !hasManualChanges())
 then
     $lowerRank.getIntersectingNodes().forEach(node -> update(node));
-    $lowerRank.remove("X.6.0", "remove Entity of lower rank, when contained by entity of type ENTITY");
+    $lowerRank.remove("X.6.0", "remove Entity of lower rank, when contained by entity of type ENTITY or HINT");
     retract($lowerRank);
 end
-rule "X.6.1: remove Entity of higher rank, when intersected by entity of type ENTITY and length of lower rank Entity is bigger than the higher rank Entity"
+rule "X.6.1: remove Entity, when contained in another entity of type ENTITY or HINT with larger text range"
 salience 32
 when
-    $higherRank: TextEntity($type: type(), $value: value, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active(), !hasManualChanges())
-    $lowerRank: TextEntity(intersects($higherRank), type() != $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), dictionary.getDictionaryRank(type) < dictionary.getDictionaryRank($type), active(), $lowerRank.getValue().length() > $value.length())
+    $outer: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
+    $inner: TextEntity(containedBy($outer), type() != $type, $outer.getTextRange().length > getTextRange().length(), !hasManualChanges())
then
-    $higherRank.getIntersectingNodes().forEach(node -> update(node));
-    $higherRank.remove("X.6.1", "remove Entity of higher rank, when intersected by entity of type ENTITY and length of lower rank Entity is bigger than the higher rank Entity");
-    retract($higherRank);
+    $inner.getIntersectingNodes().forEach(node -> update(node));
+    $inner.remove("X.6.1", "remove Entity, when contained in another entity of type ENTITY or HINT with larger text range");
+    retract($inner);
 end
@@ -379,6 +363,18 @@ rule "X.8.1: Remove Entity when intersected by imported Entity"
 end
+// Rule unit: X.11
+rule "X.11.0: Remove dictionary entity which intersects with a manual entity"
+salience 64
+when
+    $manualEntity: TextEntity(engines contains Engine.MANUAL, active())
+    $dictionaryEntity: TextEntity(intersects($manualEntity), dictionaryEntry, engines not contains Engine.MANUAL)
+then
+    $dictionaryEntity.remove("X.11.0", "remove dictionary entity which intersects with a manual entity");
+    retract($dictionaryEntity);
+end
 //------------------------------------ File attributes rules ------------------------------------
 // Rule unit: FA.1
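The reworked X.6.1 above no longer compares dictionary ranks and value lengths of merely intersecting entities; it drops an entity that is fully contained in an ENTITY/HINT of a different type whose text range is strictly longer. A hedged Python sketch of the new condition (the `Entity` shape here is hypothetical, and the manual-change and entity-type filters are omitted):

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical stand-in for TextEntity: a type tag plus a
    half-open character range."""
    type: str
    start: int
    end: int

    def length(self) -> int:
        return self.end - self.start

    def contained_by(self, other: "Entity") -> bool:
        return other.start <= self.start and self.end <= other.end


def x61_removes(inner: Entity, outer: Entity) -> bool:
    """Assumed X.6.1 semantics: remove `inner` when it is contained in
    `outer`, their types differ, and `outer` spans strictly more text."""
    return (inner.contained_by(outer)
            and inner.type != outer.type
            and outer.length() > inner.length())
```

Note that the strict length comparison means two co-extensive entities of different types no longer eliminate each other under this rule.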


@@ -338,7 +338,7 @@ rule "MAN.3.2: Apply image recategorization"
 rule "MAN.3.3: Apply recategorization entities by default"
 salience 128
 when
-    $entity: IEntity(getManualOverwrite().getRecategorized().orElse(false), !dictionary.isHint(type()))
+    $entity: TextEntity(getManualOverwrite().getRecategorized().orElse(false), !dictionary.isHint(type()))
 then
     $entity.apply("MAN.3.3", "Recategorized entities are applied by default.", $entity.legalBasis());
 end
@@ -374,8 +374,8 @@ rule "MAN.4.1: Apply legal basis change"
 rule "X.0.0: Remove Entity contained by Entity of same type"
 salience 65
 when
-    $larger: TextEntity($type: type(), $entityType: entityType, active() || skipped())
-    $contained: TextEntity(containedBy($larger), type() == $type, entityType == $entityType, this != $larger, !hasManualChanges(), active())
+    $larger: TextEntity($type: type(), $entityType: entityType, !removed())
+    $contained: TextEntity(containedBy($larger), type() == $type, entityType == $entityType, this != $larger, !hasManualChanges())
 then
     $contained.remove("X.0.0", "remove Entity contained by Entity of same type");
     retract($contained);
@@ -387,7 +387,7 @@ rule "X.2.0: Remove Entity of type ENTITY when contained by FALSE_POSITIVE"
 salience 64
 when
     $falsePositive: TextEntity($type: type(), entityType == EntityType.FALSE_POSITIVE, active())
-    $entity: TextEntity(containedBy($falsePositive), type() == $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), !hasManualChanges(), active())
+    $entity: TextEntity(containedBy($falsePositive), type() == $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), !hasManualChanges())
 then
     $entity.getIntersectingNodes().forEach(node -> update(node));
     $entity.remove("X.2.0", "remove Entity of type ENTITY when contained by FALSE_POSITIVE");
@@ -400,7 +400,7 @@ rule "X.3.0: Remove Entity of type RECOMMENDATION when contained by FALSE_RECOMM
 salience 64
 when
     $falseRecommendation: TextEntity($type: type(), entityType == EntityType.FALSE_RECOMMENDATION, active())
-    $recommendation: TextEntity(containedBy($falseRecommendation), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+    $recommendation: TextEntity(containedBy($falseRecommendation), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
     $recommendation.remove("X.3.0", "remove Entity of type RECOMMENDATION when contained by FALSE_RECOMMENDATION");
     retract($recommendation);
@@ -412,7 +412,7 @@ rule "X.4.0: Remove Entity of type RECOMMENDATION when text range equals ENTITY
 salience 256
 when
     $entity: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
-    $recommendation: TextEntity(getTextRange().equals($entity.getTextRange()), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+    $recommendation: TextEntity(getTextRange().equals($entity.getTextRange()), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
     $entity.addEngines($recommendation.getEngines());
     $recommendation.remove("X.4.0", "remove Entity of type RECOMMENDATION when text range equals ENTITY with same type");
@@ -425,7 +425,7 @@ rule "X.5.0: Remove Entity of type RECOMMENDATION when intersected by ENTITY"
 salience 256
 when
     $entity: TextEntity((entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
-    $recommendation: TextEntity(intersects($entity), entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+    $recommendation: TextEntity(intersects($entity), entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
     $recommendation.remove("X.5.0", "remove Entity of type RECOMMENDATION when intersected by ENTITY");
     retract($recommendation);
@@ -435,7 +435,7 @@ rule "X.5.1: Remove Entity of type RECOMMENDATION when contained by RECOMMENDATI
 salience 256
 when
     $entity: TextEntity($type: type(), entityType == EntityType.RECOMMENDATION, active())
-    $recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+    $recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
     $recommendation.remove("X.5.1", "remove Entity of type RECOMMENDATION when contained by RECOMMENDATION");
     retract($recommendation);
@@ -443,26 +443,26 @@ rule "X.5.1: Remove Entity of type RECOMMENDATION when contained by RECOMMENDATI
 // Rule unit: X.6
-rule "X.6.0: Remove Entity of lower rank, when contained by by entity of type ENTITY"
+rule "X.6.0: Remove Entity of lower rank, when contained by entity of type ENTITY or HINT"
 salience 32
 when
     $higherRank: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
-    $lowerRank: TextEntity(containedBy($higherRank), type() != $type, dictionary.getDictionaryRank(type) < dictionary.getDictionaryRank($type), !hasManualChanges(), active())
+    $lowerRank: TextEntity(containedBy($higherRank), type() != $type, dictionary.getDictionaryRank(type) < dictionary.getDictionaryRank($type), !hasManualChanges())
 then
     $lowerRank.getIntersectingNodes().forEach(node -> update(node));
-    $lowerRank.remove("X.6.0", "remove Entity of lower rank, when contained by entity of type ENTITY");
+    $lowerRank.remove("X.6.0", "remove Entity of lower rank, when contained by entity of type ENTITY or HINT");
     retract($lowerRank);
 end
-rule "X.6.1: remove Entity of higher rank, when intersected by entity of type ENTITY and length of lower rank Entity is bigger than the higher rank Entity"
+rule "X.6.1: remove Entity, when contained in another entity of type ENTITY or HINT with larger text range"
 salience 32
 when
-    $higherRank: TextEntity($type: type(), $value: value, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active(), !hasManualChanges())
-    $lowerRank: TextEntity(intersects($higherRank), type() != $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), dictionary.getDictionaryRank(type) < dictionary.getDictionaryRank($type), active(), $lowerRank.getValue().length() > $value.length())
+    $outer: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
+    $inner: TextEntity(containedBy($outer), type() != $type, $outer.getTextRange().length > getTextRange().length(), !hasManualChanges())
 then
-    $higherRank.getIntersectingNodes().forEach(node -> update(node));
-    $higherRank.remove("X.6.1", "remove Entity of higher rank, when intersected by entity of type ENTITY and length of lower rank Entity is bigger than the higher rank Entity");
-    retract($higherRank);
+    $inner.getIntersectingNodes().forEach(node -> update(node));
+    $inner.remove("X.6.1", "remove Entity, when contained in another entity of type ENTITY or HINT with larger text range");
+    retract($inner);
 end
@@ -500,6 +500,18 @@ rule "X.8.1: Remove Entity when intersected by imported Entity"
 end
+// Rule unit: X.11
+rule "X.11.0: Remove dictionary entity which intersects with a manual entity"
+salience 64
+when
+    $manualEntity: TextEntity(engines contains Engine.MANUAL, active())
+    $dictionaryEntity: TextEntity(intersects($manualEntity), dictionaryEntry, engines not contains Engine.MANUAL)
+then
+    $dictionaryEntity.remove("X.11.0", "remove dictionary entity which intersects with a manual entity");
+    retract($dictionaryEntity);
+end
 //------------------------------------ File attributes rules ------------------------------------
 // Rule unit: FA.1
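The new X.11.0 rule, added identically in each file above, gives manually drawn entities precedence over dictionary hits: any dictionary-derived entity that overlaps an active manual entity is discarded. A minimal Python sketch under a hypothetical data model (the field names are assumptions, not the service's API):

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical TextEntity stand-in for illustrating X.11.0."""
    start: int
    end: int
    engines: set = field(default_factory=set)
    dictionary_entry: bool = False
    active: bool = True

    def intersects(self, other: "Entity") -> bool:
        return self.start < other.end and other.start < self.end


def apply_x11(entities: list) -> list:
    """Drop dictionary entities that intersect an active MANUAL entity
    and were not themselves produced by the manual engine."""
    manual = [e for e in entities if "MANUAL" in e.engines and e.active]
    return [
        e for e in entities
        if not (e.dictionary_entry
                and "MANUAL" not in e.engines
                and any(e.intersects(m) for m in manual))
    ]
```

Usage: an entity that is both manual and a dictionary entry survives, since the `engines not contains Engine.MANUAL` guard in the rule excludes it from removal.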


@@ -238,7 +238,7 @@ rule "MAN.3.2: Apply image recategorization"
 rule "MAN.3.3: Apply recategorization entities by default"
 salience 128
 when
-    $entity: IEntity(getManualOverwrite().getRecategorized().orElse(false), !dictionary.isHint(type()))
+    $entity: TextEntity(getManualOverwrite().getRecategorized().orElse(false), !dictionary.isHint(type()))
 then
     $entity.apply("MAN.3.3", "Recategorized entities are applied by default.", $entity.legalBasis());
 end
@@ -274,36 +274,20 @@ rule "MAN.4.1: Apply legal basis change"
 rule "X.0.0: Remove Entity contained by Entity of same type"
 salience 65
 when
-    $larger: TextEntity($type: type(), $entityType: entityType, active() || skipped())
-    $contained: TextEntity(containedBy($larger), type() == $type, entityType == $entityType, this != $larger, !hasManualChanges(), active())
+    $larger: TextEntity($type: type(), $entityType: entityType, !removed())
+    $contained: TextEntity(containedBy($larger), type() == $type, entityType == $entityType, this != $larger, !hasManualChanges())
 then
     $contained.remove("X.0.0", "remove Entity contained by Entity of same type");
     retract($contained);
 end
-// Rule unit: X.1
-rule "X.1.0: Merge intersecting Entities of same type"
-salience 64
-when
-    $first: TextEntity($type: type(), $entityType: entityType, !resized(), active())
-    $second: TextEntity(intersects($first), type() == $type, entityType == $entityType, this != $first, !hasManualChanges(), active())
-then
-    TextEntity mergedEntity = entityCreationService.mergeEntitiesOfSameType(List.of($first, $second), $type, $entityType, document);
-    $first.remove("X.1.0", "merge intersecting Entities of same type");
-    $second.remove("X.1.0", "merge intersecting Entities of same type");
-    retract($first);
-    retract($second);
-    mergedEntity.getIntersectingNodes().forEach(node -> update(node));
-end
 // Rule unit: X.2
 rule "X.2.0: Remove Entity of type ENTITY when contained by FALSE_POSITIVE"
 salience 64
 when
     $falsePositive: TextEntity($type: type(), entityType == EntityType.FALSE_POSITIVE, active())
-    $entity: TextEntity(containedBy($falsePositive), type() == $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), !hasManualChanges(), active())
+    $entity: TextEntity(containedBy($falsePositive), type() == $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), !hasManualChanges())
 then
     $entity.getIntersectingNodes().forEach(node -> update(node));
     $entity.remove("X.2.0", "remove Entity of type ENTITY when contained by FALSE_POSITIVE");
@@ -316,7 +300,7 @@ rule "X.3.0: Remove Entity of type RECOMMENDATION when contained by FALSE_RECOMM
 salience 64
 when
     $falseRecommendation: TextEntity($type: type(), entityType == EntityType.FALSE_RECOMMENDATION, active())
-    $recommendation: TextEntity(containedBy($falseRecommendation), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+    $recommendation: TextEntity(containedBy($falseRecommendation), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
     $recommendation.remove("X.3.0", "remove Entity of type RECOMMENDATION when contained by FALSE_RECOMMENDATION");
     retract($recommendation);
@@ -328,7 +312,7 @@ rule "X.4.0: Remove Entity of type RECOMMENDATION when text range equals ENTITY
 salience 256
 when
     $entity: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
-    $recommendation: TextEntity(getTextRange().equals($entity.getTextRange()), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+    $recommendation: TextEntity(getTextRange().equals($entity.getTextRange()), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
     $entity.addEngines($recommendation.getEngines());
     $recommendation.remove("X.4.0", "remove Entity of type RECOMMENDATION when text range equals ENTITY with same type");
@@ -341,7 +325,7 @@ rule "X.5.0: Remove Entity of type RECOMMENDATION when intersected by ENTITY"
 salience 256
 when
     $entity: TextEntity((entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
-    $recommendation: TextEntity(intersects($entity), entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+    $recommendation: TextEntity(intersects($entity), entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
     $recommendation.remove("X.5.0", "remove Entity of type RECOMMENDATION when intersected by ENTITY");
     retract($recommendation);
@@ -351,7 +335,7 @@ rule "X.5.1: Remove Entity of type RECOMMENDATION when contained by RECOMMENDATI
 salience 256
 when
     $entity: TextEntity($type: type(), entityType == EntityType.RECOMMENDATION, active())
-    $recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+    $recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
     $recommendation.remove("X.5.1", "remove Entity of type RECOMMENDATION when contained by RECOMMENDATION");
     retract($recommendation);
@@ -359,26 +343,26 @@ rule "X.5.1: Remove Entity of type RECOMMENDATION when contained by RECOMMENDATI
 // Rule unit: X.6
-rule "X.6.0: Remove Entity of lower rank, when contained by by entity of type ENTITY"
+rule "X.6.0: Remove Entity of lower rank, when contained by entity of type ENTITY or HINT"
 salience 32
 when
     $higherRank: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
-    $lowerRank: TextEntity(containedBy($higherRank), type() != $type, dictionary.getDictionaryRank(type) < dictionary.getDictionaryRank($type), !hasManualChanges(), active())
+    $lowerRank: TextEntity(containedBy($higherRank), type() != $type, dictionary.getDictionaryRank(type) < dictionary.getDictionaryRank($type), !hasManualChanges())
 then
     $lowerRank.getIntersectingNodes().forEach(node -> update(node));
-    $lowerRank.remove("X.6.0", "remove Entity of lower rank, when contained by entity of type ENTITY");
+    $lowerRank.remove("X.6.0", "remove Entity of lower rank, when contained by entity of type ENTITY or HINT");
     retract($lowerRank);
 end
-rule "X.6.1: remove Entity of higher rank, when intersected by entity of type ENTITY and length of lower rank Entity is bigger than the higher rank Entity"
+rule "X.6.1: remove Entity, when contained in another entity of type ENTITY or HINT with larger text range"
 salience 32
 when
-    $higherRank: TextEntity($type: type(), $value: value, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active(), !hasManualChanges())
-    $lowerRank: TextEntity(intersects($higherRank), type() != $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), dictionary.getDictionaryRank(type) < dictionary.getDictionaryRank($type), active(), $lowerRank.getValue().length() > $value.length())
+    $outer: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
+    $inner: TextEntity(containedBy($outer), type() != $type, $outer.getTextRange().length > getTextRange().length(), !hasManualChanges())
 then
-    $higherRank.getIntersectingNodes().forEach(node -> update(node));
-    $higherRank.remove("X.6.1", "remove Entity of higher rank, when intersected by entity of type ENTITY and length of lower rank Entity is bigger than the higher rank Entity");
-    retract($higherRank);
+    $inner.getIntersectingNodes().forEach(node -> update(node));
+    $inner.remove("X.6.1", "remove Entity, when contained in another entity of type ENTITY or HINT with larger text range");
+    retract($inner);
 end
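The rewritten X.6.1 rule replaces the old intersect-and-compare-value-length condition with a simpler one: an entity is dropped when it is contained by an ENTITY or HINT whose text range is strictly larger. A minimal sketch of that guard, using hypothetical half-open offsets rather than the service's actual `TextRange` API:

```java
public class ContainmentCheck {
    // Hypothetical half-open ranges [start, end); the real TextRange class may differ.
    static boolean containedBy(int innerStart, int innerEnd, int outerStart, int outerEnd) {
        return outerStart <= innerStart && innerEnd <= outerEnd;
    }

    // Mirrors the rewritten X.6.1 condition: the contained entity is removed only when
    // the outer range is strictly larger, so two equal ranges never remove each other
    // (equal ranges are handled by other rules such as X.0.0 and X.4.0).
    static boolean innerIsRemovable(int innerStart, int innerEnd, int outerStart, int outerEnd) {
        return containedBy(innerStart, innerEnd, outerStart, outerEnd)
                && (outerEnd - outerStart) > (innerEnd - innerStart);
    }

    public static void main(String[] args) {
        System.out.println(innerIsRemovable(5, 8, 0, 10));  // true: strictly inside
        System.out.println(innerIsRemovable(0, 10, 0, 10)); // false: equal ranges
    }
}
```

The strict inequality is what prevents the rule from firing symmetrically on two coextensive entities and retracting both.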


@@ -1,16 +0,0 @@
-<Configuration>
-    <Appenders>
-        <Console name="CONSOLE" target="SYSTEM_OUT">
-            <PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n"/>
-        </Console>
-    </Appenders>
-    <Loggers>
-        <Root level="warn">
-            <AppenderRef ref="CONSOLE"/>
-        </Root>
-        <Logger name="com.iqser" level="info"/>
-    </Loggers>
-</Configuration>


@@ -0,0 +1,17 @@
+<configuration>
+    <springProperty scope="configuration" name="logType" source="logging.type"/>
+    <springProperty scope="context" name="application.name" source="spring.application.name"/>
+    <springProperty scope="context" name="version" source="project.version"/>
+    <include resource="org/springframework/boot/logging/logback/defaults.xml"/>
+    <include resource="org/springframework/boot/logging/logback/console-appender.xml"/>
+    <appender name="JSON" class="ch.qos.logback.core.ConsoleAppender">
+        <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
+    </appender>
+    <root level="INFO">
+        <appender-ref ref="${logType}"/>
+    </root>
+</configuration>


@@ -449,14 +449,23 @@ rule "ETC.3.1: Redact logos (non vertebrate study)"
 // Rule unit: ETC.5
-rule "ETC.5.0: Ignore dossier_redaction entries if confidentiality is not 'confidential'"
+rule "ETC.5.0: Skip dossier_redaction entries if confidentiality is 'confidential'"
+when
+    FileAttribute(label == "Confidentiality", value == "confidential")
+    $dossierRedaction: TextEntity(type() == "dossier_redaction")
+then
+    $dossierRedaction.skip("ETC.5.0", "Ignore dossier_redaction when confidential");
+    $dossierRedaction.getIntersectingNodes().forEach(node -> update(node));
+end
+rule "ETC.5.1: Remove dossier_redaction entries if confidentiality is not 'confidential'"
+salience 256
 when
     not FileAttribute(label == "Confidentiality", value == "confidential")
     $dossierRedaction: TextEntity(type() == "dossier_redaction")
 then
-    $dossierRedaction.ignore("ETC.5.0", "Ignore dossier redactions, when not confidential");
-    update($dossierRedaction);
+    $dossierRedaction.remove("ETC.5.1", "Remove dossier_redaction when not confidential");
+    retract($dossierRedaction);
+    $dossierRedaction.getIntersectingNodes().forEach(node -> update(node));
 end


@@ -1423,7 +1423,7 @@ rule "MAN.3.2: Apply image recategorization"
 rule "MAN.3.3: Apply recategorization entities by default"
 salience 128
 when
-    $entity: IEntity(getManualOverwrite().getRecategorized().orElse(false), !dictionary.isHint(type()))
+    $entity: TextEntity(getManualOverwrite().getRecategorized().orElse(false), !dictionary.isHint(type()))
 then
     $entity.apply("MAN.3.3", "Recategorized entities are applied by default.", $entity.legalBasis());
 end
@@ -1458,8 +1458,8 @@ rule "MAN.4.1: Apply legal basis change"
 rule "X.0.0: Remove Entity contained by Entity of same type"
 salience 65
 when
-    $larger: TextEntity($type: type(), $entityType: entityType, active() || skipped())
-    $contained: TextEntity(containedBy($larger), type() == $type, entityType == $entityType, this != $larger, !hasManualChanges(), active())
+    $larger: TextEntity($type: type(), $entityType: entityType, !removed())
+    $contained: TextEntity(containedBy($larger), type() == $type, entityType == $entityType, this != $larger, !hasManualChanges())
 then
     $contained.remove("X.0.0", "remove Entity contained by Entity of same type");
     retract($contained);
@@ -1471,7 +1471,7 @@ rule "X.2.0: Remove Entity of type ENTITY when contained by FALSE_POSITIVE"
 salience 64
 when
     $falsePositive: TextEntity($type: type(), entityType == EntityType.FALSE_POSITIVE, active())
-    $entity: TextEntity(containedBy($falsePositive), type() == $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), !hasManualChanges(), active())
+    $entity: TextEntity(containedBy($falsePositive), type() == $type, (entityType == EntityType.ENTITY || entityType == EntityType.HINT), !hasManualChanges())
 then
     $entity.getIntersectingNodes().forEach(node -> update(node));
     $entity.remove("X.2.0", "remove Entity of type ENTITY when contained by FALSE_POSITIVE");
@@ -1484,7 +1484,7 @@ rule "X.3.0: Remove Entity of type RECOMMENDATION when contained by FALSE_RECOMM
 salience 64
 when
     $falseRecommendation: TextEntity($type: type(), entityType == EntityType.FALSE_RECOMMENDATION, active())
-    $recommendation: TextEntity(containedBy($falseRecommendation), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+    $recommendation: TextEntity(containedBy($falseRecommendation), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
     $recommendation.remove("X.3.0", "remove Entity of type RECOMMENDATION when contained by FALSE_RECOMMENDATION");
     retract($recommendation);
@@ -1496,7 +1496,7 @@ rule "X.4.0: Remove Entity of type RECOMMENDATION when text range equals ENTITY
 salience 256
 when
     $entity: TextEntity($type: type(), (entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
-    $recommendation: TextEntity(getTextRange().equals($entity.getTextRange()), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+    $recommendation: TextEntity(getTextRange().equals($entity.getTextRange()), type() == $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
     $entity.addEngines($recommendation.getEngines());
     $recommendation.remove("X.4.0", "remove Entity of type RECOMMENDATION when text range equals ENTITY with same type");
@@ -1509,7 +1509,7 @@ rule "X.5.0: Remove Entity of type RECOMMENDATION when intersected by ENTITY"
 salience 256
 when
     $entity: TextEntity((entityType == EntityType.ENTITY || entityType == EntityType.HINT), active())
-    $recommendation: TextEntity(intersects($entity), entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+    $recommendation: TextEntity(intersects($entity), entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
     $recommendation.remove("X.5.0", "remove Entity of type RECOMMENDATION when intersected by ENTITY");
     retract($recommendation);
@@ -1521,7 +1521,7 @@ rule "X.5.1: Remove Entity of type RECOMMENDATION when contained by RECOMMENDATI
 salience 256
 when
     $entity: TextEntity($type: type(), entityType == EntityType.RECOMMENDATION, active())
-    $recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges(), active())
+    $recommendation: TextEntity(containedBy($entity), type() != $type, entityType == EntityType.RECOMMENDATION, !hasManualChanges())
 then
     $recommendation.remove("X.5.1", "remove Entity of type RECOMMENDATION when contained by RECOMMENDATION");
     retract($recommendation);
@@ -1562,6 +1562,18 @@ rule "X.8.1: Remove Entity when intersected by imported Entity"
 end
+// Rule unit: X.11
+rule "X.11.0: Remove dictionary entity which intersects with a manual entity"
+salience 64
+when
+    $manualEntity: TextEntity(engines contains Engine.MANUAL, active())
+    $dictionaryEntity: TextEntity(intersects($manualEntity), dictionaryEntry, engines not contains Engine.MANUAL)
+then
+    $dictionaryEntity.remove("X.11.0", "remove dictionary entity which intersects with a manual entity");
+    retract($dictionaryEntity);
+end
 //------------------------------------ File attributes rules ------------------------------------
 // Rule unit: FA.1


@@ -26,8 +26,8 @@ public class RuleFileMigrationTest {
     // Put your redaction service drools paths and dossier-templates paths both RM and DM here
     static final List<String> ruleFileDirs = List.of(
             "/home/kschuettler/iqser/redaction/redaction-service/redaction-service-v1/redaction-service-server-v1/src/test/resources/drools",
-            "/home/kschuettler/iqser/fforesight/dossier-templates-v2/",
-            "/home/kschuettler/iqser/redaction/dossier-templates-v2/");
+            "/home/kschuettler/iqser/fforesight/dossier-templates-v2",
+            "/home/kschuettler/iqser/redaction/dossier-templates-v2");
     @Test
@@ -36,7 +36,11 @@ public class RuleFileMigrationTest {
     void migrateAllEntityRules() {
         for (String ruleFileDir : ruleFileDirs) {
-            Files.walk(Path.of(ruleFileDir)).filter(this::isEntityRuleFile).map(Path::toFile).peek(System.out::println).forEach(RuleFileMigrator::migrateFile);
+            Files.walk(Path.of(ruleFileDir))
+                    .filter(this::isEntityRuleFile)
+                    .map(Path::toFile)
+                    .peek(System.out::println)
+                    .forEach(RuleFileMigrator::migrateFile);
         }
     }
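The reformatted migration test walks each configured directory and feeds every matching rule file to `RuleFileMigrator::migrateFile`. A self-contained sketch of the same `Files.walk`/filter shape, with a hypothetical `isRuleFile` predicate standing in for the test's `isEntityRuleFile`:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

public class WalkRuleFiles {
    // Hypothetical stand-in for the test's isEntityRuleFile() predicate.
    static boolean isRuleFile(Path path) {
        return path.toString().endsWith(".drl");
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway directory with one matching and one non-matching file.
        Path dir = Files.createTempDirectory("rules");
        Files.writeString(dir.resolve("entity.drl"), "// rule file");
        Files.writeString(dir.resolve("notes.txt"), "not a rule file");

        // Same walk/filter shape as the test; the stream is closed via
        // try-with-resources, which the diff's chained one-liner does not do.
        try (Stream<Path> paths = Files.walk(dir)) {
            List<Path> ruleFiles = paths.filter(WalkRuleFiles::isRuleFile).toList();
            System.out.println(ruleFiles.size()); // 1
        }
    }
}
```

Note that `Files.walk` returns a `Stream<Path>` backed by open directory handles, so closing it (as above) is the documented pattern; the test in the diff leaves that to garbage collection.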