Compare commits

..

44 Commits

Author SHA1 Message Date
maverickstuder
557990273d RED-9859: update intersecting nodes on kie session insertion 2024-09-12 12:13:03 +02:00
maverickstuder
63041927fc RED-9859: Redactions found by et. al. rule not skipped with published information
- modify unit test
2024-09-10 14:58:46 +02:00
maverickstuder
10e0c68a1f RED-9859: Redactions found by et. al. rule not skipped with published information
- switch CBI.7.* with and CBI.16.*
2024-09-10 14:58:45 +02:00
Andrei Isvoran
3ea73aa859 Merge branch 'RED-9986-bp' into 'release/4.348.x'
RED-9986 - Add component rules path to be scanned for Javadoc generation

See merge request redactmanager/redaction-service!511
2024-09-09 12:44:01 +02:00
Andrei Isvoran
179ac6d9ad RED-9986 - Add component rules path to be scanned for Javadoc generation 2024-09-09 11:33:31 +03:00
Kilian Schüttler
bdc6ab7e96 Merge branch 'RED-9964' into 'release/4.348.x'
RED-9964: fix errors with images

See merge request redactmanager/redaction-service!505
2024-09-04 09:16:11 +02:00
Kilian Schüttler
e959d60ec0 RED-9964: fix errors with images 2024-09-04 09:16:11 +02:00
Kilian Schüttler
6b6d06d24e Merge branch 'RED-9964' into 'release/4.348.x'
RED-9964: refactor getMainBody() and getMainBodyTextBlock() in Page

See merge request redactmanager/redaction-service!502
2024-09-02 16:51:05 +02:00
Kilian Schüttler
8ac0657795 RED-9964: refactor getMainBody() and getMainBodyTextBlock() in Page 2024-09-02 16:51:04 +02:00
Maverick Studer
dad17bb504 Merge branch 'RED-9865-bp2' into 'release/4.348.x'
RED-9865: fix for case 2

See merge request redactmanager/redaction-service!495
2024-08-23 17:01:21 +02:00
Maverick Studer
f445b7fe69 RED-9865: fix for case 2 2024-08-23 17:01:21 +02:00
Dominique Eifländer
0692cc90e4 Merge branch 'RED-9837-4.1' into 'release/4.348.x'
RED-9837: Fixed not working timeout with endless loop in drools then block

See merge request redactmanager/redaction-service!490
2024-08-19 14:08:31 +02:00
Dominique Eifländer
ab114b0920 RED-9837: Fixed not working timeout with endless loop in drools then block 2024-08-19 13:20:04 +02:00
Dominique Eifländer
7396c04314 Merge branch 'RED-9760-4.1-rules' into 'release/4.348.x'
RED-9760: Do not check blacklisted keywords in Strings

See merge request redactmanager/redaction-service!488
2024-08-13 12:01:49 +02:00
Dominique Eifländer
305cd8f5ac RED-9760: Do not check blacklisted keywords in Strings 2024-08-13 11:11:25 +02:00
Maverick Studer
b4ecbde89e Merge branch 'RED-9782-fix-bp' into 'release/4.348.x'
RED-9782: Automated Analysis should be disabled when uploading a document that...

See merge request redactmanager/redaction-service!484
2024-08-12 18:40:56 +02:00
Maverick Studer
7c31d4f70b RED-9782: Automated Analysis should be disabled when uploading a document that... 2024-08-12 18:40:56 +02:00
Kilian Schüttler
f08654a082 Merge branch 'hotfix' into 'release/4.348.x'
Fix UOE in ComponentDroolsExecutionService

See merge request redactmanager/redaction-service!482
2024-08-12 15:59:14 +02:00
Kilian Schüttler
2cf7f7c7b2 Fix UOE in ComponentDroolsExecutionService 2024-08-12 15:59:14 +02:00
Kilian Schüttler
a51f10b9d1 Merge branch 'RED-9869-bp' into 'release/4.348.x'
RED-9869: allow java.text and find ruleIdentifiers with whitespaces/linebreaks

See merge request redactmanager/redaction-service!480
2024-08-12 15:19:40 +02:00
Kilian Schuettler
8c36035655 RED-9869: allow java.text and find ruleIdentifiers with whitespaces/linebreaks 2024-08-12 12:39:42 +02:00
Maverick Studer
67bb4fe7f9 Merge branch 'hotfixes-dm-release' into 'release/4.348.x'
Hotfixes dm release

See merge request redactmanager/redaction-service!477
2024-08-09 16:52:40 +02:00
Maverick Studer
2a9101306c Hotfixes dm release 2024-08-09 16:52:40 +02:00
Maverick Studer
6ecac11df5 Merge branch 'RED-9857-bp' into 'release/4.348.x'
RED-9857: Add new date format

See merge request redactmanager/redaction-service!475
2024-08-09 10:30:23 +02:00
Maverick Studer
e663fd2f2a RED-9857: Add new date format 2024-08-09 10:30:22 +02:00
Dominique Eifländer
0ef4087b36 Merge branch 'RED-9760-anyheadline-4.1' into 'release/4.348.x'
RED-9760: Changed anyHeadlineContains to act like in the previous version

See merge request redactmanager/redaction-service!473
2024-08-07 15:17:33 +02:00
Dominique Eifländer
bf3ae1606b RED-9760: Changed anyHeadlineContains to act like in the previous version 2024-08-07 14:56:16 +02:00
Maverick Studer
43620f7b52 Merge branch 'RED-9782' into 'release/4.348.x'
RED-9782: Automated Analysis should be disabled when uploading a document that...

See merge request redactmanager/redaction-service!471
2024-08-07 12:26:03 +02:00
Maverick Studer
92fc003576 RED-9782: Automated Analysis should be disabled when uploading a document that... 2024-08-07 12:26:02 +02:00
Dominique Eifländer
ed02a83289 Merge branch 'RED-9782-4.1' into 'release/4.348.x'
Resolve RED-9782 "4.1"

See merge request redactmanager/redaction-service!469
2024-08-02 14:40:19 +02:00
Dominique Eifländer
78f5aaa54e Resolve RED-9782 "4.1" 2024-08-02 14:40:19 +02:00
Andrei Isvoran
acb5b4c308 Merge branch 'RED-9770' into 'release/4.348.x'
RED-9770 - Extend date converter

See merge request redactmanager/redaction-service!467
2024-07-30 10:29:48 +02:00
Andrei Isvoran
61ee1c12ca RED-9770 - Extend date converter 2024-07-30 10:56:58 +03:00
Kilian Schüttler
abec7ae6bf Merge branch 'annotationMode-bp' into 'release/4.348.x'
annotationMode: ignore IDs of manual adds in annotationMode

See merge request redactmanager/redaction-service!466
2024-07-26 14:53:41 +02:00
Kilian Schuettler
afeddb4d91 annotationMode: ignore IDs of manual adds in annotationMode 2024-07-26 14:03:28 +02:00
Dominique Eifländer
359c237943 Merge branch 'RED-9658-mongo-4.1' into 'release/4.348.x'
RED-9658: Fixed wrong mongo database name

See merge request redactmanager/redaction-service!461
2024-07-17 11:02:22 +02:00
Dominique Eifländer
9789943f45 RED-9658: Fixed wrong mongo database name 2024-07-17 10:44:30 +02:00
Andrei Isvoran
f096aab156 Merge branch 'RED-9667' into 'release/4.348.x'
RED-9667 - Extend convert dates

See merge request redactmanager/redaction-service!459
2024-07-16 16:02:20 +02:00
Andrei Isvoran
156b102e87 RED-9667 - Extend convert dates 2024-07-16 16:02:20 +02:00
Andrei Isvoran
180728721a Merge branch 'RED-9496-graceful-shutdown-bp' into 'release/4.348.x'
RED-9496 - Implement graceful shutdown

See merge request redactmanager/redaction-service!457
2024-07-04 13:56:45 +02:00
Andrei Isvoran
fb9d1042ac RED-9496 - Implement graceful shutdown 2024-07-04 14:03:43 +03:00
Corina Olariu
046b4b29b9 Merge branch 'RED-9466-bp' into 'release/4.348.x'
RED-9466 - Adding annotation removes all AI based recommendations until forced re-analysis

See merge request redactmanager/redaction-service!452
2024-06-28 15:31:16 +02:00
Corina Olariu
dce797ef8e RED-9466 - Adding annotation removes all AI based recommendations until forced re-analysis 2024-06-28 15:31:16 +02:00
Kilian Schüttler
8b8dab2a18 RED-9375: use storageId for cache names everywhere, such that name may be updated by a user
(cherry picked from commit c1a2e9dee209413ca7a3738dc746ee2397aa1319)
2024-06-27 16:13:24 +02:00
504 changed files with 3822039 additions and 155132 deletions


@@ -7,25 +7,20 @@ include:
ref: 'main'
file: 'ci-templates/gradle_java.yml'
publish dependencies:
deploy JavaDoc:
stage: deploy
tags:
- dind
script:
- echo "Publishing dependencies with gradle version ${BUILDVERSION}"
- echo "Building JavaDoc with gradle version ${BUILDVERSION}"
- gradle -Pversion=${BUILDVERSION} publish
- echo "BUILDVERSION=$(echo ${BUILDVERSION})" >> variables.env
artifacts:
reports:
dotenv: variables.env
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
- if: $CI_COMMIT_BRANCH =~ /^release/
- if: $CI_COMMIT_BRANCH =~ /^feature/
- if: $CI_COMMIT_TAG
generate JavaDoc:
stage: deploy
generateJavaDoc:
stage: build
tags:
- dind
script:
@@ -40,39 +35,14 @@ generate JavaDoc:
- if: $CI_COMMIT_TAG
pages:
stage: deploy
stage: build
needs:
- generate JavaDoc
- publish dependencies
- calculate minor version
pages:
path_prefix: "$BUILDVERSION"
- generateJavaDoc
script:
- mkdir public
- mv redaction-service-v1/redaction-service-server-v1/javadoc/* public/
- URL=$(echo $BUILDVERSION | sed -e 's|\.|-|g')
- echo "Deploying to ${CI_PAGES_URL}/${URL}"
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
artifacts:
paths:
- public
publish JavaDoc to azure:
image: rclone/rclone:1.67.0
tags:
- dind
stage: deploy
when: manual
variables:
VERSION_NAME: "latest"
needs:
- generate JavaDoc
script:
- echo "Deploy JavaDoc with version ${VERSION_NAME} to prod"
- rclone delete azurejavadocs:/$RCLONE_CONFIG_AZUREJAVADOCS_CONTAINER/${VERSION_NAME}
- rclone copy redaction-service-v1/redaction-service-server-v1/javadoc/ azurejavadocs:/$RCLONE_CONFIG_AZUREJAVADOCS_CONTAINER/javadoc/${VERSION_NAME}/
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
- if: $CI_COMMIT_BRANCH =~ /^release/
- if: $CI_COMMIT_TAG


@@ -15,13 +15,8 @@ pmd {
isConsoleOutput = true
}
tasks.checkstyleMain {
exclude("**/data/**") // ignore generated proto files
}
tasks.pmdMain {
pmd.ruleSetFiles = files("${rootDir}/config/pmd/pmd.xml")
exclude("**/data/**") // ignore generated proto files
}
tasks.pmdTest {
@@ -33,8 +28,6 @@ tasks.named<Test>("test") {
reports {
junitXml.outputLocation.set(layout.buildDirectory.dir("reports/junit"))
}
minHeapSize = "512m"
maxHeapSize = "2048m"
}
tasks.test {


@@ -9,14 +9,11 @@ gradle assemble
# Get the current Git branch
branch=$(git rev-parse --abbrev-ref HEAD)
# Replace any slashes (e.g., in 'feature/' or 'release/') with an underscore
cleaned_branch=$(echo "$branch" | sed 's/\//_/g')
# Get the short commit hash (first 5 characters)
commit_hash=$(git rev-parse --short=5 HEAD)
# Combine branch and commit hash
buildName="${USER}-${cleaned_branch}-${commit_hash}"
buildName="${USER}-${branch}-${commit_hash}"
gradle bootBuildImage --publishImage -PbuildbootDockerHostNetwork=true -Pversion=${buildName}
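The change above drops the slash-to-underscore cleanup, so the raw branch name flows into the image version. A minimal sketch of the tag construction as it stood before this change (assuming a Git checkout and a set `USER` variable; the cleanup matters because Docker tags may not contain slashes):

```shell
#!/bin/sh
# Derive a unique, Docker-tag-safe build name from user, branch, and commit.
branch=$(git rev-parse --abbrev-ref HEAD)
# 'release/4.348.x' -> 'release_4.348.x' (slashes are invalid in Docker tags)
cleaned_branch=$(echo "$branch" | sed 's/\//_/g')
commit_hash=$(git rev-parse --short=5 HEAD)
buildName="${USER}-${cleaned_branch}-${commit_hash}"
echo "$buildName"
```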


@@ -1,35 +0,0 @@
plugins {
id("com.iqser.red.service.java-conventions")
id("io.freefair.lombok") version "8.4"
}
description = "redaction-service-document"
val persistenceServiceVersion = "2.612.0-RED10072.1"
val layoutParserVersion = "newNode"
group = "com.knecon.fforesight"
dependencies {
implementation("com.iqser.red.service:persistence-service-internal-api-v1:${persistenceServiceVersion}")
api("com.google.protobuf:protobuf-java-util:4.28.3")
testImplementation("org.junit.jupiter:junit-jupiter-api:5.8.1")
testRuntimeOnly("org.junit.jupiter:junit-jupiter-engine:5.8.1")
}
publishing {
publications {
create<MavenPublication>(name) {
from(components["java"])
}
}
repositories {
maven {
url = uri("https://nexus.knecon.com/repository/red-platform-releases/")
credentials {
username = providers.gradleProperty("mavenUser").getOrNull();
password = providers.gradleProperty("mavenPassword").getOrNull();
}
}
}
}


@@ -1,36 +0,0 @@
package com.iqser.red.service.redaction.v1.server.data;
import static com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure;
import java.io.Serializable;
import com.iqser.red.service.redaction.v1.server.data.DocumentPageProto.AllDocumentPages;
import com.iqser.red.service.redaction.v1.server.data.DocumentPositionDataProto.AllDocumentPositionData;
import com.iqser.red.service.redaction.v1.server.data.DocumentTextDataProto.AllDocumentTextData;
import lombok.AccessLevel;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.experimental.FieldDefaults;
@Data
@Builder
@AllArgsConstructor
@NoArgsConstructor
@FieldDefaults(level = AccessLevel.PRIVATE)
public class DocumentData implements Serializable {
AllDocumentPages documentPages;
AllDocumentTextData documentTextData;
AllDocumentPositionData documentPositionData;
DocumentStructureWrapper documentStructureWrapper;
public DocumentStructure getDocumentStructure() {
return documentStructureWrapper.getDocumentStructure();
}
}


@@ -1,694 +0,0 @@
// Generated by the protocol buffer compiler. DO NOT EDIT!
// NO CHECKED-IN PROTOBUF GENCODE
// source: DocumentStructure.proto
// Protobuf Java Version: 4.28.3
package com.iqser.red.service.redaction.v1.server.data;
public final class DocumentStructureProto {
private DocumentStructureProto() {}
static {
com.google.protobuf.RuntimeVersion.validateProtobufGencodeVersion(
com.google.protobuf.RuntimeVersion.RuntimeDomain.PUBLIC,
/* major= */ 4,
/* minor= */ 28,
/* patch= */ 3,
/* suffix= */ "",
DocumentStructureProto.class.getName());
}
public static void registerAllExtensions(
com.google.protobuf.ExtensionRegistryLite registry) {
}
public static void registerAllExtensions(
com.google.protobuf.ExtensionRegistry registry) {
registerAllExtensions(
(com.google.protobuf.ExtensionRegistryLite) registry);
}
public interface DocumentStructureOrBuilder extends
// @@protoc_insertion_point(interface_extends:DocumentStructure)
com.google.protobuf.MessageOrBuilder {
/**
* <pre>
* The root EntryData represents the Document.
* </pre>
*
* <code>.EntryData root = 1;</code>
* @return Whether the root field is set.
*/
boolean hasRoot();
/**
* <pre>
* The root EntryData represents the Document.
* </pre>
*
* <code>.EntryData root = 1;</code>
* @return The root.
*/
com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData getRoot();
/**
* <pre>
* The root EntryData represents the Document.
* </pre>
*
* <code>.EntryData root = 1;</code>
*/
com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryDataOrBuilder getRootOrBuilder();
}
/**
* Protobuf type {@code DocumentStructure}
*/
public static final class DocumentStructure extends
com.google.protobuf.GeneratedMessage implements
// @@protoc_insertion_point(message_implements:DocumentStructure)
DocumentStructureOrBuilder {
private static final long serialVersionUID = 0L;
static {
com.google.protobuf.RuntimeVersion.validateProtobufGencodeVersion(
com.google.protobuf.RuntimeVersion.RuntimeDomain.PUBLIC,
/* major= */ 4,
/* minor= */ 28,
/* patch= */ 3,
/* suffix= */ "",
DocumentStructure.class.getName());
}
// Use DocumentStructure.newBuilder() to construct.
private DocumentStructure(com.google.protobuf.GeneratedMessage.Builder<?> builder) {
super(builder);
}
private DocumentStructure() {
}
public static final com.google.protobuf.Descriptors.Descriptor
getDescriptor() {
return com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.internal_static_DocumentStructure_descriptor;
}
@java.lang.Override
protected com.google.protobuf.GeneratedMessage.FieldAccessorTable
internalGetFieldAccessorTable() {
return com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.internal_static_DocumentStructure_fieldAccessorTable
.ensureFieldAccessorsInitialized(
com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure.class, com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure.Builder.class);
}
private int bitField0_;
public static final int ROOT_FIELD_NUMBER = 1;
private com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData root_;
/**
* <pre>
* The root EntryData represents the Document.
* </pre>
*
* <code>.EntryData root = 1;</code>
* @return Whether the root field is set.
*/
@java.lang.Override
public boolean hasRoot() {
return ((bitField0_ & 0x00000001) != 0);
}
/**
* <pre>
* The root EntryData represents the Document.
* </pre>
*
* <code>.EntryData root = 1;</code>
* @return The root.
*/
@java.lang.Override
public com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData getRoot() {
return root_ == null ? com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData.getDefaultInstance() : root_;
}
/**
* <pre>
* The root EntryData represents the Document.
* </pre>
*
* <code>.EntryData root = 1;</code>
*/
@java.lang.Override
public com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryDataOrBuilder getRootOrBuilder() {
return root_ == null ? com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData.getDefaultInstance() : root_;
}
private byte memoizedIsInitialized = -1;
@java.lang.Override
public final boolean isInitialized() {
byte isInitialized = memoizedIsInitialized;
if (isInitialized == 1) return true;
if (isInitialized == 0) return false;
memoizedIsInitialized = 1;
return true;
}
@java.lang.Override
public void writeTo(com.google.protobuf.CodedOutputStream output)
throws java.io.IOException {
if (((bitField0_ & 0x00000001) != 0)) {
output.writeMessage(1, getRoot());
}
getUnknownFields().writeTo(output);
}
@java.lang.Override
public int getSerializedSize() {
int size = memoizedSize;
if (size != -1) return size;
size = 0;
if (((bitField0_ & 0x00000001) != 0)) {
size += com.google.protobuf.CodedOutputStream
.computeMessageSize(1, getRoot());
}
size += getUnknownFields().getSerializedSize();
memoizedSize = size;
return size;
}
@java.lang.Override
public boolean equals(final java.lang.Object obj) {
if (obj == this) {
return true;
}
if (!(obj instanceof com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure)) {
return super.equals(obj);
}
com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure other = (com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure) obj;
if (hasRoot() != other.hasRoot()) return false;
if (hasRoot()) {
if (!getRoot()
.equals(other.getRoot())) return false;
}
if (!getUnknownFields().equals(other.getUnknownFields())) return false;
return true;
}
@java.lang.Override
public int hashCode() {
if (memoizedHashCode != 0) {
return memoizedHashCode;
}
int hash = 41;
hash = (19 * hash) + getDescriptor().hashCode();
if (hasRoot()) {
hash = (37 * hash) + ROOT_FIELD_NUMBER;
hash = (53 * hash) + getRoot().hashCode();
}
hash = (29 * hash) + getUnknownFields().hashCode();
memoizedHashCode = hash;
return hash;
}
public static com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure parseFrom(
java.nio.ByteBuffer data)
throws com.google.protobuf.InvalidProtocolBufferException {
return PARSER.parseFrom(data);
}
public static com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure parseFrom(
java.nio.ByteBuffer data,
com.google.protobuf.ExtensionRegistryLite extensionRegistry)
throws com.google.protobuf.InvalidProtocolBufferException {
return PARSER.parseFrom(data, extensionRegistry);
}
public static com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure parseFrom(
com.google.protobuf.ByteString data)
throws com.google.protobuf.InvalidProtocolBufferException {
return PARSER.parseFrom(data);
}
public static com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure parseFrom(
com.google.protobuf.ByteString data,
com.google.protobuf.ExtensionRegistryLite extensionRegistry)
throws com.google.protobuf.InvalidProtocolBufferException {
return PARSER.parseFrom(data, extensionRegistry);
}
public static com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure parseFrom(byte[] data)
throws com.google.protobuf.InvalidProtocolBufferException {
return PARSER.parseFrom(data);
}
public static com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure parseFrom(
byte[] data,
com.google.protobuf.ExtensionRegistryLite extensionRegistry)
throws com.google.protobuf.InvalidProtocolBufferException {
return PARSER.parseFrom(data, extensionRegistry);
}
public static com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure parseFrom(java.io.InputStream input)
throws java.io.IOException {
return com.google.protobuf.GeneratedMessage
.parseWithIOException(PARSER, input);
}
public static com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure parseFrom(
java.io.InputStream input,
com.google.protobuf.ExtensionRegistryLite extensionRegistry)
throws java.io.IOException {
return com.google.protobuf.GeneratedMessage
.parseWithIOException(PARSER, input, extensionRegistry);
}
public static com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure parseDelimitedFrom(java.io.InputStream input)
throws java.io.IOException {
return com.google.protobuf.GeneratedMessage
.parseDelimitedWithIOException(PARSER, input);
}
public static com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure parseDelimitedFrom(
java.io.InputStream input,
com.google.protobuf.ExtensionRegistryLite extensionRegistry)
throws java.io.IOException {
return com.google.protobuf.GeneratedMessage
.parseDelimitedWithIOException(PARSER, input, extensionRegistry);
}
public static com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure parseFrom(
com.google.protobuf.CodedInputStream input)
throws java.io.IOException {
return com.google.protobuf.GeneratedMessage
.parseWithIOException(PARSER, input);
}
public static com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure parseFrom(
com.google.protobuf.CodedInputStream input,
com.google.protobuf.ExtensionRegistryLite extensionRegistry)
throws java.io.IOException {
return com.google.protobuf.GeneratedMessage
.parseWithIOException(PARSER, input, extensionRegistry);
}
@java.lang.Override
public Builder newBuilderForType() { return newBuilder(); }
public static Builder newBuilder() {
return DEFAULT_INSTANCE.toBuilder();
}
public static Builder newBuilder(com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure prototype) {
return DEFAULT_INSTANCE.toBuilder().mergeFrom(prototype);
}
@java.lang.Override
public Builder toBuilder() {
return this == DEFAULT_INSTANCE
? new Builder() : new Builder().mergeFrom(this);
}
@java.lang.Override
protected Builder newBuilderForType(
com.google.protobuf.GeneratedMessage.BuilderParent parent) {
Builder builder = new Builder(parent);
return builder;
}
/**
* Protobuf type {@code DocumentStructure}
*/
public static final class Builder extends
com.google.protobuf.GeneratedMessage.Builder<Builder> implements
// @@protoc_insertion_point(builder_implements:DocumentStructure)
com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructureOrBuilder {
public static final com.google.protobuf.Descriptors.Descriptor
getDescriptor() {
return com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.internal_static_DocumentStructure_descriptor;
}
@java.lang.Override
protected com.google.protobuf.GeneratedMessage.FieldAccessorTable
internalGetFieldAccessorTable() {
return com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.internal_static_DocumentStructure_fieldAccessorTable
.ensureFieldAccessorsInitialized(
com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure.class, com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure.Builder.class);
}
// Construct using com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure.newBuilder()
private Builder() {
maybeForceBuilderInitialization();
}
private Builder(
com.google.protobuf.GeneratedMessage.BuilderParent parent) {
super(parent);
maybeForceBuilderInitialization();
}
private void maybeForceBuilderInitialization() {
if (com.google.protobuf.GeneratedMessage
.alwaysUseFieldBuilders) {
getRootFieldBuilder();
}
}
@java.lang.Override
public Builder clear() {
super.clear();
bitField0_ = 0;
root_ = null;
if (rootBuilder_ != null) {
rootBuilder_.dispose();
rootBuilder_ = null;
}
return this;
}
@java.lang.Override
public com.google.protobuf.Descriptors.Descriptor
getDescriptorForType() {
return com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.internal_static_DocumentStructure_descriptor;
}
@java.lang.Override
public com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure getDefaultInstanceForType() {
return com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure.getDefaultInstance();
}
@java.lang.Override
public com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure build() {
com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure result = buildPartial();
if (!result.isInitialized()) {
throw newUninitializedMessageException(result);
}
return result;
}
@java.lang.Override
public com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure buildPartial() {
com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure result = new com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure(this);
if (bitField0_ != 0) { buildPartial0(result); }
onBuilt();
return result;
}
private void buildPartial0(com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure result) {
int from_bitField0_ = bitField0_;
int to_bitField0_ = 0;
if (((from_bitField0_ & 0x00000001) != 0)) {
result.root_ = rootBuilder_ == null
? root_
: rootBuilder_.build();
to_bitField0_ |= 0x00000001;
}
result.bitField0_ |= to_bitField0_;
}
@java.lang.Override
public Builder mergeFrom(com.google.protobuf.Message other) {
if (other instanceof com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure) {
return mergeFrom((com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure)other);
} else {
super.mergeFrom(other);
return this;
}
}
public Builder mergeFrom(com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure other) {
if (other == com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure.getDefaultInstance()) return this;
if (other.hasRoot()) {
mergeRoot(other.getRoot());
}
this.mergeUnknownFields(other.getUnknownFields());
onChanged();
return this;
}
@java.lang.Override
public final boolean isInitialized() {
return true;
}
@java.lang.Override
public Builder mergeFrom(
com.google.protobuf.CodedInputStream input,
com.google.protobuf.ExtensionRegistryLite extensionRegistry)
throws java.io.IOException {
if (extensionRegistry == null) {
throw new java.lang.NullPointerException();
}
try {
boolean done = false;
while (!done) {
int tag = input.readTag();
switch (tag) {
case 0:
done = true;
break;
case 10: {
input.readMessage(
getRootFieldBuilder().getBuilder(),
extensionRegistry);
bitField0_ |= 0x00000001;
break;
} // case 10
default: {
if (!super.parseUnknownField(input, extensionRegistry, tag)) {
done = true; // was an endgroup tag
}
break;
} // default:
} // switch (tag)
} // while (!done)
} catch (com.google.protobuf.InvalidProtocolBufferException e) {
throw e.unwrapIOException();
} finally {
onChanged();
} // finally
return this;
}
private int bitField0_;
private com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData root_;
private com.google.protobuf.SingleFieldBuilder<
com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData, com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData.Builder, com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryDataOrBuilder> rootBuilder_;
/**
* <pre>
* The root EntryData represents the Document.
* </pre>
*
* <code>.EntryData root = 1;</code>
* @return Whether the root field is set.
*/
public boolean hasRoot() {
return ((bitField0_ & 0x00000001) != 0);
}
/**
* <pre>
* The root EntryData represents the Document.
* </pre>
*
* <code>.EntryData root = 1;</code>
* @return The root.
*/
public com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData getRoot() {
if (rootBuilder_ == null) {
return root_ == null ? com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData.getDefaultInstance() : root_;
} else {
return rootBuilder_.getMessage();
}
}
/**
* <pre>
* The root EntryData represents the Document.
* </pre>
*
* <code>.EntryData root = 1;</code>
*/
public Builder setRoot(com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData value) {
if (rootBuilder_ == null) {
if (value == null) {
throw new NullPointerException();
}
root_ = value;
} else {
rootBuilder_.setMessage(value);
}
bitField0_ |= 0x00000001;
onChanged();
return this;
}
/**
* <pre>
* The root EntryData represents the Document.
* </pre>
*
* <code>.EntryData root = 1;</code>
*/
public Builder setRoot(
com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData.Builder builderForValue) {
if (rootBuilder_ == null) {
root_ = builderForValue.build();
} else {
rootBuilder_.setMessage(builderForValue.build());
}
bitField0_ |= 0x00000001;
onChanged();
return this;
}
/**
* <pre>
* The root EntryData represents the Document.
* </pre>
*
* <code>.EntryData root = 1;</code>
*/
public Builder mergeRoot(com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData value) {
if (rootBuilder_ == null) {
if (((bitField0_ & 0x00000001) != 0) &&
root_ != null &&
root_ != com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData.getDefaultInstance()) {
getRootBuilder().mergeFrom(value);
} else {
root_ = value;
}
} else {
rootBuilder_.mergeFrom(value);
}
if (root_ != null) {
bitField0_ |= 0x00000001;
onChanged();
}
return this;
}
/**
* <pre>
* The root EntryData represents the Document.
* </pre>
*
* <code>.EntryData root = 1;</code>
*/
public Builder clearRoot() {
bitField0_ = (bitField0_ & ~0x00000001);
root_ = null;
if (rootBuilder_ != null) {
rootBuilder_.dispose();
rootBuilder_ = null;
}
onChanged();
return this;
}
/**
* <pre>
* The root EntryData represents the Document.
* </pre>
*
* <code>.EntryData root = 1;</code>
*/
public com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData.Builder getRootBuilder() {
bitField0_ |= 0x00000001;
onChanged();
return getRootFieldBuilder().getBuilder();
}
/**
* <pre>
* The root EntryData represents the Document.
* </pre>
*
* <code>.EntryData root = 1;</code>
*/
public com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryDataOrBuilder getRootOrBuilder() {
if (rootBuilder_ != null) {
return rootBuilder_.getMessageOrBuilder();
} else {
return root_ == null ?
com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData.getDefaultInstance() : root_;
}
}
/**
* <pre>
* The root EntryData represents the Document.
* </pre>
*
* <code>.EntryData root = 1;</code>
*/
private com.google.protobuf.SingleFieldBuilder<
com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData, com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData.Builder, com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryDataOrBuilder>
getRootFieldBuilder() {
if (rootBuilder_ == null) {
rootBuilder_ = new com.google.protobuf.SingleFieldBuilder<
com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData, com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData.Builder, com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryDataOrBuilder>(
getRoot(),
getParentForChildren(),
isClean());
root_ = null;
}
return rootBuilder_;
}
// @@protoc_insertion_point(builder_scope:DocumentStructure)
}
// @@protoc_insertion_point(class_scope:DocumentStructure)
private static final com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure DEFAULT_INSTANCE;
static {
DEFAULT_INSTANCE = new com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure();
}
public static com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure getDefaultInstance() {
return DEFAULT_INSTANCE;
}
private static final com.google.protobuf.Parser<DocumentStructure>
PARSER = new com.google.protobuf.AbstractParser<DocumentStructure>() {
@java.lang.Override
public DocumentStructure parsePartialFrom(
com.google.protobuf.CodedInputStream input,
com.google.protobuf.ExtensionRegistryLite extensionRegistry)
throws com.google.protobuf.InvalidProtocolBufferException {
Builder builder = newBuilder();
try {
builder.mergeFrom(input, extensionRegistry);
} catch (com.google.protobuf.InvalidProtocolBufferException e) {
throw e.setUnfinishedMessage(builder.buildPartial());
} catch (com.google.protobuf.UninitializedMessageException e) {
throw e.asInvalidProtocolBufferException().setUnfinishedMessage(builder.buildPartial());
} catch (java.io.IOException e) {
throw new com.google.protobuf.InvalidProtocolBufferException(e)
.setUnfinishedMessage(builder.buildPartial());
}
return builder.buildPartial();
}
};
public static com.google.protobuf.Parser<DocumentStructure> parser() {
return PARSER;
}
@java.lang.Override
public com.google.protobuf.Parser<DocumentStructure> getParserForType() {
return PARSER;
}
@java.lang.Override
public com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure getDefaultInstanceForType() {
return DEFAULT_INSTANCE;
}
}
private static final com.google.protobuf.Descriptors.Descriptor
internal_static_DocumentStructure_descriptor;
private static final
com.google.protobuf.GeneratedMessage.FieldAccessorTable
internal_static_DocumentStructure_fieldAccessorTable;
public static com.google.protobuf.Descriptors.FileDescriptor
getDescriptor() {
return descriptor;
}
private static com.google.protobuf.Descriptors.FileDescriptor
descriptor;
static {
java.lang.String[] descriptorData = {
"\n\027DocumentStructure.proto\032\017EntryData.pro" +
"to\"-\n\021DocumentStructure\022\030\n\004root\030\001 \001(\0132\n." +
"EntryDataBH\n.com.iqser.red.service.redac" +
"tion.v1.server.dataB\026DocumentStructurePr" +
"otob\006proto3"
};
descriptor = com.google.protobuf.Descriptors.FileDescriptor
.internalBuildGeneratedFileFrom(descriptorData,
new com.google.protobuf.Descriptors.FileDescriptor[] {
com.iqser.red.service.redaction.v1.server.data.EntryDataProto.getDescriptor(),
});
internal_static_DocumentStructure_descriptor =
getDescriptor().getMessageTypes().get(0);
internal_static_DocumentStructure_fieldAccessorTable = new
com.google.protobuf.GeneratedMessage.FieldAccessorTable(
internal_static_DocumentStructure_descriptor,
new java.lang.String[] { "Root", });
descriptor.resolveAllFeaturesImmutable();
com.iqser.red.service.redaction.v1.server.data.EntryDataProto.getDescriptor();
}
// @@protoc_insertion_point(outer_class_scope)
}


@@ -1,115 +0,0 @@
package com.iqser.red.service.redaction.v1.server.data;
import static com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure;
import static com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData;
import java.awt.geom.Rectangle2D;
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;
import lombok.AllArgsConstructor;
import lombok.Getter;
@Getter
@AllArgsConstructor
public class DocumentStructureWrapper implements Serializable {
private final DocumentStructure documentStructure;
public static class TableProperties implements Serializable {
public static final String NUMBER_OF_ROWS = "numberOfRows";
public static final String NUMBER_OF_COLS = "numberOfCols";
}
public static class ImageProperties implements Serializable {
public static final String TRANSPARENT = "transparent";
public static final String IMAGE_TYPE = "imageType";
public static final String POSITION = "position";
public static final String ID = "id";
public static final String REPRESENTATION_HASH = "representationHash";
}
public static class TableCellProperties implements Serializable {
public static final String B_BOX = "bBox";
public static final String ROW = "row";
public static final String COL = "col";
public static final String HEADER = "header";
}
public static class DuplicateParagraphProperties implements Serializable {
public static final String UNSORTED_TEXTBLOCK_ID = "utbid";
}
public static final String RECTANGLE_DELIMITER = ";";
/** Parses a delimiter-separated "x;y;width;height" bounding-box string into a Rectangle2D. */
public static Rectangle2D parseRectangle2D(String bBox) {
List<Float> floats = Arrays.stream(bBox.split(RECTANGLE_DELIMITER))
.map(Float::parseFloat)
.toList();
return new Rectangle2D.Float(floats.get(0), floats.get(1), floats.get(2), floats.get(3));
}
/** Parses a comma- or whitespace-separated list of numbers into a representation vector. */
public static double[] parseRepresentationVector(String representationHash) {
String[] stringArray = representationHash.split("[,\\s]+");
double[] doubleArray = new double[stringArray.length];
for (int i = 0; i < stringArray.length; i++) {
doubleArray[i] = Double.parseDouble(stringArray[i]);
}
return doubleArray;
}
/** Resolves a TOC path (a list of child indices, starting at the root) to its EntryData node. */
public EntryData get(List<Integer> tocId) {
if (tocId.isEmpty()) {
return documentStructure.getRoot();
}
EntryData entry = documentStructure.getRoot().getChildrenList()
.get(tocId.get(0));
for (int id : tocId.subList(1, tocId.size())) {
entry = entry.getChildrenList()
.get(id);
}
return entry;
}
public Stream<EntryData> streamAllEntries() {
return flatten(documentStructure.getRoot());
}
@Override
public String toString() {
return String.join("\n",
streamAllEntries().map(EntryData::toString)
.toList());
}
private static Stream<EntryData> flatten(EntryData entry) {
return Stream.concat(Stream.of(entry),
entry.getChildrenList()
.stream()
.flatMap(DocumentStructureWrapper::flatten));
}
}
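The `parseRectangle2D` helper above assumes a `"x;y;width;height"` ordering of the four delimited values (matching the `Rectangle2D.Float` constructor). A minimal standalone sketch of that parsing logic; the `BBoxParseDemo` class name and the sample coordinates are illustrative, not from the codebase:

```java
import java.awt.geom.Rectangle2D;
import java.util.Arrays;
import java.util.List;

public class BBoxParseDemo {

    static final String RECTANGLE_DELIMITER = ";";

    // Mirrors DocumentStructureWrapper.parseRectangle2D:
    // splits "x;y;width;height" and feeds the four floats to Rectangle2D.Float.
    static Rectangle2D parseRectangle2D(String bBox) {
        List<Float> floats = Arrays.stream(bBox.split(RECTANGLE_DELIMITER))
                .map(Float::parseFloat)
                .toList();
        return new Rectangle2D.Float(floats.get(0), floats.get(1), floats.get(2), floats.get(3));
    }

    public static void main(String[] args) {
        Rectangle2D r = parseRectangle2D("10.5;20.0;100.0;50.0");
        System.out.println(r.getX() + " " + r.getWidth()); // 10.5 100.0
    }
}
```

Note the helper throws `NumberFormatException` on malformed input and `IndexOutOfBoundsException` if fewer than four values are present; callers are expected to pass well-formed `bBox` property strings.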


@@ -1,176 +0,0 @@
// Generated by the protocol buffer compiler. DO NOT EDIT!
// NO CHECKED-IN PROTOBUF GENCODE
// source: LayoutEngine.proto
// Protobuf Java Version: 4.28.3
package com.iqser.red.service.redaction.v1.server.data;
public final class LayoutEngineProto {
private LayoutEngineProto() {}
static {
com.google.protobuf.RuntimeVersion.validateProtobufGencodeVersion(
com.google.protobuf.RuntimeVersion.RuntimeDomain.PUBLIC,
/* major= */ 4,
/* minor= */ 28,
/* patch= */ 3,
/* suffix= */ "",
LayoutEngineProto.class.getName());
}
public static void registerAllExtensions(
com.google.protobuf.ExtensionRegistryLite registry) {
}
public static void registerAllExtensions(
com.google.protobuf.ExtensionRegistry registry) {
registerAllExtensions(
(com.google.protobuf.ExtensionRegistryLite) registry);
}
/**
* Protobuf enum {@code LayoutEngine}
*/
public enum LayoutEngine
implements com.google.protobuf.ProtocolMessageEnum {
/**
* <code>ALGORITHM = 0;</code>
*/
ALGORITHM(0),
/**
* <code>AI = 1;</code>
*/
AI(1),
/**
* <code>OUTLINE = 2;</code>
*/
OUTLINE(2),
UNRECOGNIZED(-1),
;
static {
com.google.protobuf.RuntimeVersion.validateProtobufGencodeVersion(
com.google.protobuf.RuntimeVersion.RuntimeDomain.PUBLIC,
/* major= */ 4,
/* minor= */ 28,
/* patch= */ 3,
/* suffix= */ "",
LayoutEngine.class.getName());
}
/**
* <code>ALGORITHM = 0;</code>
*/
public static final int ALGORITHM_VALUE = 0;
/**
* <code>AI = 1;</code>
*/
public static final int AI_VALUE = 1;
/**
* <code>OUTLINE = 2;</code>
*/
public static final int OUTLINE_VALUE = 2;
public final int getNumber() {
if (this == UNRECOGNIZED) {
throw new java.lang.IllegalArgumentException(
"Can't get the number of an unknown enum value.");
}
return value;
}
/**
* @param value The numeric wire value of the corresponding enum entry.
* @return The enum associated with the given numeric wire value.
* @deprecated Use {@link #forNumber(int)} instead.
*/
@java.lang.Deprecated
public static LayoutEngine valueOf(int value) {
return forNumber(value);
}
/**
* @param value The numeric wire value of the corresponding enum entry.
* @return The enum associated with the given numeric wire value.
*/
public static LayoutEngine forNumber(int value) {
switch (value) {
case 0: return ALGORITHM;
case 1: return AI;
case 2: return OUTLINE;
default: return null;
}
}
public static com.google.protobuf.Internal.EnumLiteMap<LayoutEngine>
internalGetValueMap() {
return internalValueMap;
}
private static final com.google.protobuf.Internal.EnumLiteMap<
LayoutEngine> internalValueMap =
new com.google.protobuf.Internal.EnumLiteMap<LayoutEngine>() {
public LayoutEngine findValueByNumber(int number) {
return LayoutEngine.forNumber(number);
}
};
public final com.google.protobuf.Descriptors.EnumValueDescriptor
getValueDescriptor() {
if (this == UNRECOGNIZED) {
throw new java.lang.IllegalStateException(
"Can't get the descriptor of an unrecognized enum value.");
}
return getDescriptor().getValues().get(ordinal());
}
public final com.google.protobuf.Descriptors.EnumDescriptor
getDescriptorForType() {
return getDescriptor();
}
public static final com.google.protobuf.Descriptors.EnumDescriptor
getDescriptor() {
return com.iqser.red.service.redaction.v1.server.data.LayoutEngineProto.getDescriptor().getEnumTypes().get(0);
}
private static final LayoutEngine[] VALUES = values();
public static LayoutEngine valueOf(
com.google.protobuf.Descriptors.EnumValueDescriptor desc) {
if (desc.getType() != getDescriptor()) {
throw new java.lang.IllegalArgumentException(
"EnumValueDescriptor is not for this type.");
}
if (desc.getIndex() == -1) {
return UNRECOGNIZED;
}
return VALUES[desc.getIndex()];
}
private final int value;
private LayoutEngine(int value) {
this.value = value;
}
// @@protoc_insertion_point(enum_scope:LayoutEngine)
}
public static com.google.protobuf.Descriptors.FileDescriptor
getDescriptor() {
return descriptor;
}
private static com.google.protobuf.Descriptors.FileDescriptor
descriptor;
static {
java.lang.String[] descriptorData = {
"\n\022LayoutEngine.proto*2\n\014LayoutEngine\022\r\n\t" +
"ALGORITHM\020\000\022\006\n\002AI\020\001\022\013\n\007OUTLINE\020\002BC\n.com." +
"iqser.red.service.redaction.v1.server.da" +
"taB\021LayoutEngineProtob\006proto3"
};
descriptor = com.google.protobuf.Descriptors.FileDescriptor
.internalBuildGeneratedFileFrom(descriptorData,
new com.google.protobuf.Descriptors.FileDescriptor[] {
});
descriptor.resolveAllFeaturesImmutable();
}
// @@protoc_insertion_point(outer_class_scope)
}


@@ -1,261 +0,0 @@
// Generated by the protocol buffer compiler. DO NOT EDIT!
// NO CHECKED-IN PROTOBUF GENCODE
// source: NodeType.proto
// Protobuf Java Version: 4.28.3
package com.iqser.red.service.redaction.v1.server.data;
public final class NodeTypeProto {
private NodeTypeProto() {}
static {
com.google.protobuf.RuntimeVersion.validateProtobufGencodeVersion(
com.google.protobuf.RuntimeVersion.RuntimeDomain.PUBLIC,
/* major= */ 4,
/* minor= */ 28,
/* patch= */ 3,
/* suffix= */ "",
NodeTypeProto.class.getName());
}
public static void registerAllExtensions(
com.google.protobuf.ExtensionRegistryLite registry) {
}
public static void registerAllExtensions(
com.google.protobuf.ExtensionRegistry registry) {
registerAllExtensions(
(com.google.protobuf.ExtensionRegistryLite) registry);
}
/**
* Protobuf enum {@code NodeType}
*/
public enum NodeType
implements com.google.protobuf.ProtocolMessageEnum {
/**
* <code>DOCUMENT = 0;</code>
*/
DOCUMENT(0),
/**
* <code>SECTION = 1;</code>
*/
SECTION(1),
/**
* <code>SUPER_SECTION = 2;</code>
*/
SUPER_SECTION(2),
/**
* <code>HEADLINE = 3;</code>
*/
HEADLINE(3),
/**
* <code>PARAGRAPH = 4;</code>
*/
PARAGRAPH(4),
/**
* <code>TABLE = 5;</code>
*/
TABLE(5),
/**
* <code>TABLE_CELL = 6;</code>
*/
TABLE_CELL(6),
/**
* <code>IMAGE = 7;</code>
*/
IMAGE(7),
/**
* <code>HEADER = 8;</code>
*/
HEADER(8),
/**
* <code>FOOTER = 9;</code>
*/
FOOTER(9),
/**
* <code>TABLE_OF_CONTENTS = 10;</code>
*/
TABLE_OF_CONTENTS(10),
/**
* <code>TABLE_OF_CONTENTS_ITEM = 11;</code>
*/
TABLE_OF_CONTENTS_ITEM(11),
UNRECOGNIZED(-1),
;
static {
com.google.protobuf.RuntimeVersion.validateProtobufGencodeVersion(
com.google.protobuf.RuntimeVersion.RuntimeDomain.PUBLIC,
/* major= */ 4,
/* minor= */ 28,
/* patch= */ 3,
/* suffix= */ "",
NodeType.class.getName());
}
/**
* <code>DOCUMENT = 0;</code>
*/
public static final int DOCUMENT_VALUE = 0;
/**
* <code>SECTION = 1;</code>
*/
public static final int SECTION_VALUE = 1;
/**
* <code>SUPER_SECTION = 2;</code>
*/
public static final int SUPER_SECTION_VALUE = 2;
/**
* <code>HEADLINE = 3;</code>
*/
public static final int HEADLINE_VALUE = 3;
/**
* <code>PARAGRAPH = 4;</code>
*/
public static final int PARAGRAPH_VALUE = 4;
/**
* <code>TABLE = 5;</code>
*/
public static final int TABLE_VALUE = 5;
/**
* <code>TABLE_CELL = 6;</code>
*/
public static final int TABLE_CELL_VALUE = 6;
/**
* <code>IMAGE = 7;</code>
*/
public static final int IMAGE_VALUE = 7;
/**
* <code>HEADER = 8;</code>
*/
public static final int HEADER_VALUE = 8;
/**
* <code>FOOTER = 9;</code>
*/
public static final int FOOTER_VALUE = 9;
/**
* <code>TABLE_OF_CONTENTS = 10;</code>
*/
public static final int TABLE_OF_CONTENTS_VALUE = 10;
/**
* <code>TABLE_OF_CONTENTS_ITEM = 11;</code>
*/
public static final int TABLE_OF_CONTENTS_ITEM_VALUE = 11;
public final int getNumber() {
if (this == UNRECOGNIZED) {
throw new java.lang.IllegalArgumentException(
"Can't get the number of an unknown enum value.");
}
return value;
}
/**
* @param value The numeric wire value of the corresponding enum entry.
* @return The enum associated with the given numeric wire value.
* @deprecated Use {@link #forNumber(int)} instead.
*/
@java.lang.Deprecated
public static NodeType valueOf(int value) {
return forNumber(value);
}
/**
* @param value The numeric wire value of the corresponding enum entry.
* @return The enum associated with the given numeric wire value.
*/
public static NodeType forNumber(int value) {
switch (value) {
case 0: return DOCUMENT;
case 1: return SECTION;
case 2: return SUPER_SECTION;
case 3: return HEADLINE;
case 4: return PARAGRAPH;
case 5: return TABLE;
case 6: return TABLE_CELL;
case 7: return IMAGE;
case 8: return HEADER;
case 9: return FOOTER;
case 10: return TABLE_OF_CONTENTS;
case 11: return TABLE_OF_CONTENTS_ITEM;
default: return null;
}
}
public static com.google.protobuf.Internal.EnumLiteMap<NodeType>
internalGetValueMap() {
return internalValueMap;
}
private static final com.google.protobuf.Internal.EnumLiteMap<
NodeType> internalValueMap =
new com.google.protobuf.Internal.EnumLiteMap<NodeType>() {
public NodeType findValueByNumber(int number) {
return NodeType.forNumber(number);
}
};
public final com.google.protobuf.Descriptors.EnumValueDescriptor
getValueDescriptor() {
if (this == UNRECOGNIZED) {
throw new java.lang.IllegalStateException(
"Can't get the descriptor of an unrecognized enum value.");
}
return getDescriptor().getValues().get(ordinal());
}
public final com.google.protobuf.Descriptors.EnumDescriptor
getDescriptorForType() {
return getDescriptor();
}
public static final com.google.protobuf.Descriptors.EnumDescriptor
getDescriptor() {
return com.iqser.red.service.redaction.v1.server.data.NodeTypeProto.getDescriptor().getEnumTypes().get(0);
}
private static final NodeType[] VALUES = values();
public static NodeType valueOf(
com.google.protobuf.Descriptors.EnumValueDescriptor desc) {
if (desc.getType() != getDescriptor()) {
throw new java.lang.IllegalArgumentException(
"EnumValueDescriptor is not for this type.");
}
if (desc.getIndex() == -1) {
return UNRECOGNIZED;
}
return VALUES[desc.getIndex()];
}
private final int value;
private NodeType(int value) {
this.value = value;
}
// @@protoc_insertion_point(enum_scope:NodeType)
}
public static com.google.protobuf.Descriptors.FileDescriptor
getDescriptor() {
return descriptor;
}
private static com.google.protobuf.Descriptors.FileDescriptor
descriptor;
static {
java.lang.String[] descriptorData = {
"\n\016NodeType.proto*\306\001\n\010NodeType\022\014\n\010DOCUMEN" +
"T\020\000\022\013\n\007SECTION\020\001\022\021\n\rSUPER_SECTION\020\002\022\014\n\010H" +
"EADLINE\020\003\022\r\n\tPARAGRAPH\020\004\022\t\n\005TABLE\020\005\022\016\n\nT" +
"ABLE_CELL\020\006\022\t\n\005IMAGE\020\007\022\n\n\006HEADER\020\010\022\n\n\006FO" +
"OTER\020\t\022\025\n\021TABLE_OF_CONTENTS\020\n\022\032\n\026TABLE_O" +
"F_CONTENTS_ITEM\020\013B?\n.com.iqser.red.servi" +
"ce.redaction.v1.server.dataB\rNodeTypePro" +
"tob\006proto3"
};
descriptor = com.google.protobuf.Descriptors.FileDescriptor
.internalBuildGeneratedFileFrom(descriptorData,
new com.google.protobuf.Descriptors.FileDescriptor[] {
});
descriptor.resolveAllFeaturesImmutable();
}
// @@protoc_insertion_point(outer_class_scope)
}


@@ -1,606 +0,0 @@
// Generated by the protocol buffer compiler. DO NOT EDIT!
// NO CHECKED-IN PROTOBUF GENCODE
// source: Range.proto
// Protobuf Java Version: 4.28.3
package com.iqser.red.service.redaction.v1.server.data;
public final class RangeProto {
private RangeProto() {}
static {
com.google.protobuf.RuntimeVersion.validateProtobufGencodeVersion(
com.google.protobuf.RuntimeVersion.RuntimeDomain.PUBLIC,
/* major= */ 4,
/* minor= */ 28,
/* patch= */ 3,
/* suffix= */ "",
RangeProto.class.getName());
}
public static void registerAllExtensions(
com.google.protobuf.ExtensionRegistryLite registry) {
}
public static void registerAllExtensions(
com.google.protobuf.ExtensionRegistry registry) {
registerAllExtensions(
(com.google.protobuf.ExtensionRegistryLite) registry);
}
public interface RangeOrBuilder extends
// @@protoc_insertion_point(interface_extends:Range)
com.google.protobuf.MessageOrBuilder {
/**
* <pre>
* A start index.
* </pre>
*
* <code>int32 start = 1;</code>
* @return The start.
*/
int getStart();
/**
* <pre>
* An end index.
* </pre>
*
* <code>int32 end = 2;</code>
* @return The end.
*/
int getEnd();
}
/**
* Protobuf type {@code Range}
*/
public static final class Range extends
com.google.protobuf.GeneratedMessage implements
// @@protoc_insertion_point(message_implements:Range)
RangeOrBuilder {
private static final long serialVersionUID = 0L;
static {
com.google.protobuf.RuntimeVersion.validateProtobufGencodeVersion(
com.google.protobuf.RuntimeVersion.RuntimeDomain.PUBLIC,
/* major= */ 4,
/* minor= */ 28,
/* patch= */ 3,
/* suffix= */ "",
Range.class.getName());
}
// Use Range.newBuilder() to construct.
private Range(com.google.protobuf.GeneratedMessage.Builder<?> builder) {
super(builder);
}
private Range() {
}
public static final com.google.protobuf.Descriptors.Descriptor
getDescriptor() {
return com.iqser.red.service.redaction.v1.server.data.RangeProto.internal_static_Range_descriptor;
}
@java.lang.Override
protected com.google.protobuf.GeneratedMessage.FieldAccessorTable
internalGetFieldAccessorTable() {
return com.iqser.red.service.redaction.v1.server.data.RangeProto.internal_static_Range_fieldAccessorTable
.ensureFieldAccessorsInitialized(
com.iqser.red.service.redaction.v1.server.data.RangeProto.Range.class, com.iqser.red.service.redaction.v1.server.data.RangeProto.Range.Builder.class);
}
public static final int START_FIELD_NUMBER = 1;
private int start_ = 0;
/**
* <pre>
* A start index.
* </pre>
*
* <code>int32 start = 1;</code>
* @return The start.
*/
@java.lang.Override
public int getStart() {
return start_;
}
public static final int END_FIELD_NUMBER = 2;
private int end_ = 0;
/**
* <pre>
* An end index.
* </pre>
*
* <code>int32 end = 2;</code>
* @return The end.
*/
@java.lang.Override
public int getEnd() {
return end_;
}
private byte memoizedIsInitialized = -1;
@java.lang.Override
public final boolean isInitialized() {
byte isInitialized = memoizedIsInitialized;
if (isInitialized == 1) return true;
if (isInitialized == 0) return false;
memoizedIsInitialized = 1;
return true;
}
@java.lang.Override
public void writeTo(com.google.protobuf.CodedOutputStream output)
throws java.io.IOException {
if (start_ != 0) {
output.writeInt32(1, start_);
}
if (end_ != 0) {
output.writeInt32(2, end_);
}
getUnknownFields().writeTo(output);
}
@java.lang.Override
public int getSerializedSize() {
int size = memoizedSize;
if (size != -1) return size;
size = 0;
if (start_ != 0) {
size += com.google.protobuf.CodedOutputStream
.computeInt32Size(1, start_);
}
if (end_ != 0) {
size += com.google.protobuf.CodedOutputStream
.computeInt32Size(2, end_);
}
size += getUnknownFields().getSerializedSize();
memoizedSize = size;
return size;
}
@java.lang.Override
public boolean equals(final java.lang.Object obj) {
if (obj == this) {
return true;
}
if (!(obj instanceof com.iqser.red.service.redaction.v1.server.data.RangeProto.Range)) {
return super.equals(obj);
}
com.iqser.red.service.redaction.v1.server.data.RangeProto.Range other = (com.iqser.red.service.redaction.v1.server.data.RangeProto.Range) obj;
if (getStart()
!= other.getStart()) return false;
if (getEnd()
!= other.getEnd()) return false;
if (!getUnknownFields().equals(other.getUnknownFields())) return false;
return true;
}
@java.lang.Override
public int hashCode() {
if (memoizedHashCode != 0) {
return memoizedHashCode;
}
int hash = 41;
hash = (19 * hash) + getDescriptor().hashCode();
hash = (37 * hash) + START_FIELD_NUMBER;
hash = (53 * hash) + getStart();
hash = (37 * hash) + END_FIELD_NUMBER;
hash = (53 * hash) + getEnd();
hash = (29 * hash) + getUnknownFields().hashCode();
memoizedHashCode = hash;
return hash;
}
public static com.iqser.red.service.redaction.v1.server.data.RangeProto.Range parseFrom(
java.nio.ByteBuffer data)
throws com.google.protobuf.InvalidProtocolBufferException {
return PARSER.parseFrom(data);
}
public static com.iqser.red.service.redaction.v1.server.data.RangeProto.Range parseFrom(
java.nio.ByteBuffer data,
com.google.protobuf.ExtensionRegistryLite extensionRegistry)
throws com.google.protobuf.InvalidProtocolBufferException {
return PARSER.parseFrom(data, extensionRegistry);
}
public static com.iqser.red.service.redaction.v1.server.data.RangeProto.Range parseFrom(
com.google.protobuf.ByteString data)
throws com.google.protobuf.InvalidProtocolBufferException {
return PARSER.parseFrom(data);
}
public static com.iqser.red.service.redaction.v1.server.data.RangeProto.Range parseFrom(
com.google.protobuf.ByteString data,
com.google.protobuf.ExtensionRegistryLite extensionRegistry)
throws com.google.protobuf.InvalidProtocolBufferException {
return PARSER.parseFrom(data, extensionRegistry);
}
public static com.iqser.red.service.redaction.v1.server.data.RangeProto.Range parseFrom(byte[] data)
throws com.google.protobuf.InvalidProtocolBufferException {
return PARSER.parseFrom(data);
}
public static com.iqser.red.service.redaction.v1.server.data.RangeProto.Range parseFrom(
byte[] data,
com.google.protobuf.ExtensionRegistryLite extensionRegistry)
throws com.google.protobuf.InvalidProtocolBufferException {
return PARSER.parseFrom(data, extensionRegistry);
}
public static com.iqser.red.service.redaction.v1.server.data.RangeProto.Range parseFrom(java.io.InputStream input)
throws java.io.IOException {
return com.google.protobuf.GeneratedMessage
.parseWithIOException(PARSER, input);
}
public static com.iqser.red.service.redaction.v1.server.data.RangeProto.Range parseFrom(
java.io.InputStream input,
com.google.protobuf.ExtensionRegistryLite extensionRegistry)
throws java.io.IOException {
return com.google.protobuf.GeneratedMessage
.parseWithIOException(PARSER, input, extensionRegistry);
}
public static com.iqser.red.service.redaction.v1.server.data.RangeProto.Range parseDelimitedFrom(java.io.InputStream input)
throws java.io.IOException {
return com.google.protobuf.GeneratedMessage
.parseDelimitedWithIOException(PARSER, input);
}
public static com.iqser.red.service.redaction.v1.server.data.RangeProto.Range parseDelimitedFrom(
java.io.InputStream input,
com.google.protobuf.ExtensionRegistryLite extensionRegistry)
throws java.io.IOException {
return com.google.protobuf.GeneratedMessage
.parseDelimitedWithIOException(PARSER, input, extensionRegistry);
}
public static com.iqser.red.service.redaction.v1.server.data.RangeProto.Range parseFrom(
com.google.protobuf.CodedInputStream input)
throws java.io.IOException {
return com.google.protobuf.GeneratedMessage
.parseWithIOException(PARSER, input);
}
public static com.iqser.red.service.redaction.v1.server.data.RangeProto.Range parseFrom(
com.google.protobuf.CodedInputStream input,
com.google.protobuf.ExtensionRegistryLite extensionRegistry)
throws java.io.IOException {
return com.google.protobuf.GeneratedMessage
.parseWithIOException(PARSER, input, extensionRegistry);
}
@java.lang.Override
public Builder newBuilderForType() { return newBuilder(); }
public static Builder newBuilder() {
return DEFAULT_INSTANCE.toBuilder();
}
public static Builder newBuilder(com.iqser.red.service.redaction.v1.server.data.RangeProto.Range prototype) {
return DEFAULT_INSTANCE.toBuilder().mergeFrom(prototype);
}
@java.lang.Override
public Builder toBuilder() {
return this == DEFAULT_INSTANCE
? new Builder() : new Builder().mergeFrom(this);
}
@java.lang.Override
protected Builder newBuilderForType(
com.google.protobuf.GeneratedMessage.BuilderParent parent) {
Builder builder = new Builder(parent);
return builder;
}
/**
* Protobuf type {@code Range}
*/
public static final class Builder extends
com.google.protobuf.GeneratedMessage.Builder<Builder> implements
// @@protoc_insertion_point(builder_implements:Range)
com.iqser.red.service.redaction.v1.server.data.RangeProto.RangeOrBuilder {
public static final com.google.protobuf.Descriptors.Descriptor
getDescriptor() {
return com.iqser.red.service.redaction.v1.server.data.RangeProto.internal_static_Range_descriptor;
}
@java.lang.Override
protected com.google.protobuf.GeneratedMessage.FieldAccessorTable
internalGetFieldAccessorTable() {
return com.iqser.red.service.redaction.v1.server.data.RangeProto.internal_static_Range_fieldAccessorTable
.ensureFieldAccessorsInitialized(
com.iqser.red.service.redaction.v1.server.data.RangeProto.Range.class, com.iqser.red.service.redaction.v1.server.data.RangeProto.Range.Builder.class);
}
// Construct using com.iqser.red.service.redaction.v1.server.data.RangeProto.Range.newBuilder()
private Builder() {
}
private Builder(
com.google.protobuf.GeneratedMessage.BuilderParent parent) {
super(parent);
}
@java.lang.Override
public Builder clear() {
super.clear();
bitField0_ = 0;
start_ = 0;
end_ = 0;
return this;
}
@java.lang.Override
public com.google.protobuf.Descriptors.Descriptor
getDescriptorForType() {
return com.iqser.red.service.redaction.v1.server.data.RangeProto.internal_static_Range_descriptor;
}
@java.lang.Override
public com.iqser.red.service.redaction.v1.server.data.RangeProto.Range getDefaultInstanceForType() {
return com.iqser.red.service.redaction.v1.server.data.RangeProto.Range.getDefaultInstance();
}
@java.lang.Override
public com.iqser.red.service.redaction.v1.server.data.RangeProto.Range build() {
com.iqser.red.service.redaction.v1.server.data.RangeProto.Range result = buildPartial();
if (!result.isInitialized()) {
throw newUninitializedMessageException(result);
}
return result;
}
@java.lang.Override
public com.iqser.red.service.redaction.v1.server.data.RangeProto.Range buildPartial() {
com.iqser.red.service.redaction.v1.server.data.RangeProto.Range result = new com.iqser.red.service.redaction.v1.server.data.RangeProto.Range(this);
if (bitField0_ != 0) { buildPartial0(result); }
onBuilt();
return result;
}
private void buildPartial0(com.iqser.red.service.redaction.v1.server.data.RangeProto.Range result) {
int from_bitField0_ = bitField0_;
if (((from_bitField0_ & 0x00000001) != 0)) {
result.start_ = start_;
}
if (((from_bitField0_ & 0x00000002) != 0)) {
result.end_ = end_;
}
}
@java.lang.Override
public Builder mergeFrom(com.google.protobuf.Message other) {
if (other instanceof com.iqser.red.service.redaction.v1.server.data.RangeProto.Range) {
return mergeFrom((com.iqser.red.service.redaction.v1.server.data.RangeProto.Range)other);
} else {
super.mergeFrom(other);
return this;
}
}
public Builder mergeFrom(com.iqser.red.service.redaction.v1.server.data.RangeProto.Range other) {
if (other == com.iqser.red.service.redaction.v1.server.data.RangeProto.Range.getDefaultInstance()) return this;
if (other.getStart() != 0) {
setStart(other.getStart());
}
if (other.getEnd() != 0) {
setEnd(other.getEnd());
}
this.mergeUnknownFields(other.getUnknownFields());
onChanged();
return this;
}
@java.lang.Override
public final boolean isInitialized() {
return true;
}
@java.lang.Override
public Builder mergeFrom(
com.google.protobuf.CodedInputStream input,
com.google.protobuf.ExtensionRegistryLite extensionRegistry)
throws java.io.IOException {
if (extensionRegistry == null) {
throw new java.lang.NullPointerException();
}
try {
boolean done = false;
while (!done) {
int tag = input.readTag();
switch (tag) {
case 0:
done = true;
break;
case 8: {
start_ = input.readInt32();
bitField0_ |= 0x00000001;
break;
} // case 8
case 16: {
end_ = input.readInt32();
bitField0_ |= 0x00000002;
break;
} // case 16
default: {
if (!super.parseUnknownField(input, extensionRegistry, tag)) {
done = true; // was an endgroup tag
}
break;
} // default:
} // switch (tag)
} // while (!done)
} catch (com.google.protobuf.InvalidProtocolBufferException e) {
throw e.unwrapIOException();
} finally {
onChanged();
} // finally
return this;
}
private int bitField0_;
private int start_ ;
/**
* <pre>
* A start index.
* </pre>
*
* <code>int32 start = 1;</code>
* @return The start.
*/
@java.lang.Override
public int getStart() {
return start_;
}
/**
* <pre>
* A start index.
* </pre>
*
* <code>int32 start = 1;</code>
* @param value The start to set.
* @return This builder for chaining.
*/
public Builder setStart(int value) {
start_ = value;
bitField0_ |= 0x00000001;
onChanged();
return this;
}
/**
* <pre>
* A start index.
* </pre>
*
* <code>int32 start = 1;</code>
* @return This builder for chaining.
*/
public Builder clearStart() {
bitField0_ = (bitField0_ & ~0x00000001);
start_ = 0;
onChanged();
return this;
}
private int end_ ;
/**
* <pre>
* An end index.
* </pre>
*
* <code>int32 end = 2;</code>
* @return The end.
*/
@java.lang.Override
public int getEnd() {
return end_;
}
/**
* <pre>
* An end index.
* </pre>
*
* <code>int32 end = 2;</code>
* @param value The end to set.
* @return This builder for chaining.
*/
public Builder setEnd(int value) {
end_ = value;
bitField0_ |= 0x00000002;
onChanged();
return this;
}
/**
* <pre>
* An end index.
* </pre>
*
* <code>int32 end = 2;</code>
* @return This builder for chaining.
*/
public Builder clearEnd() {
bitField0_ = (bitField0_ & ~0x00000002);
end_ = 0;
onChanged();
return this;
}
// @@protoc_insertion_point(builder_scope:Range)
}
// @@protoc_insertion_point(class_scope:Range)
private static final com.iqser.red.service.redaction.v1.server.data.RangeProto.Range DEFAULT_INSTANCE;
static {
DEFAULT_INSTANCE = new com.iqser.red.service.redaction.v1.server.data.RangeProto.Range();
}
public static com.iqser.red.service.redaction.v1.server.data.RangeProto.Range getDefaultInstance() {
return DEFAULT_INSTANCE;
}
private static final com.google.protobuf.Parser<Range>
PARSER = new com.google.protobuf.AbstractParser<Range>() {
@java.lang.Override
public Range parsePartialFrom(
com.google.protobuf.CodedInputStream input,
com.google.protobuf.ExtensionRegistryLite extensionRegistry)
throws com.google.protobuf.InvalidProtocolBufferException {
Builder builder = newBuilder();
try {
builder.mergeFrom(input, extensionRegistry);
} catch (com.google.protobuf.InvalidProtocolBufferException e) {
throw e.setUnfinishedMessage(builder.buildPartial());
} catch (com.google.protobuf.UninitializedMessageException e) {
throw e.asInvalidProtocolBufferException().setUnfinishedMessage(builder.buildPartial());
} catch (java.io.IOException e) {
throw new com.google.protobuf.InvalidProtocolBufferException(e)
.setUnfinishedMessage(builder.buildPartial());
}
return builder.buildPartial();
}
};
public static com.google.protobuf.Parser<Range> parser() {
return PARSER;
}
@java.lang.Override
public com.google.protobuf.Parser<Range> getParserForType() {
return PARSER;
}
@java.lang.Override
public com.iqser.red.service.redaction.v1.server.data.RangeProto.Range getDefaultInstanceForType() {
return DEFAULT_INSTANCE;
}
}
private static final com.google.protobuf.Descriptors.Descriptor
internal_static_Range_descriptor;
private static final
com.google.protobuf.GeneratedMessage.FieldAccessorTable
internal_static_Range_fieldAccessorTable;
public static com.google.protobuf.Descriptors.FileDescriptor
getDescriptor() {
return descriptor;
}
private static com.google.protobuf.Descriptors.FileDescriptor
descriptor;
static {
java.lang.String[] descriptorData = {
"\n\013Range.proto\"#\n\005Range\022\r\n\005start\030\001 \001(\005\022\013\n" +
"\003end\030\002 \001(\005B<\n.com.iqser.red.service.reda" +
"ction.v1.server.dataB\nRangeProtob\006proto3"
};
descriptor = com.google.protobuf.Descriptors.FileDescriptor
.internalBuildGeneratedFileFrom(descriptorData,
new com.google.protobuf.Descriptors.FileDescriptor[] {
});
internal_static_Range_descriptor =
getDescriptor().getMessageTypes().get(0);
internal_static_Range_fieldAccessorTable = new
com.google.protobuf.GeneratedMessage.FieldAccessorTable(
internal_static_Range_descriptor,
new java.lang.String[] { "Start", "End", });
descriptor.resolveAllFeaturesImmutable();
}
// @@protoc_insertion_point(outer_class_scope)
}

@@ -1,25 +0,0 @@
package com.iqser.red.service.redaction.v1.server.data.old;
import java.io.Serializable;
import lombok.AccessLevel;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.experimental.FieldDefaults;
@Deprecated
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
@FieldDefaults(level = AccessLevel.PRIVATE)
public class DocumentPage implements Serializable {
int number;
int height;
int width;
int rotation;
}

@@ -1,24 +0,0 @@
package com.iqser.red.service.redaction.v1.server.data.old;
import java.io.Serializable;
import lombok.AccessLevel;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.experimental.FieldDefaults;
@Deprecated
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
@FieldDefaults(level = AccessLevel.PRIVATE)
public class DocumentPositionData implements Serializable {
Long id;
int[] stringIdxToPositionIdx;
float[][] positions;
}

@@ -1,158 +0,0 @@
package com.iqser.red.service.redaction.v1.server.data.old;
import java.awt.geom.Rectangle2D;
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Stream;
import lombok.AccessLevel;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.experimental.FieldDefaults;
@Deprecated
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
@FieldDefaults(level = AccessLevel.PRIVATE)
public class DocumentStructure implements Serializable {
EntryData root;
public static class TableProperties implements Serializable {
public static final String NUMBER_OF_ROWS = "numberOfRows";
public static final String NUMBER_OF_COLS = "numberOfCols";
}
public static class ImageProperties implements Serializable {
public static final String TRANSPARENT = "transparent";
public static final String IMAGE_TYPE = "imageType";
public static final String POSITION = "position";
public static final String ID = "id";
public static final String REPRESENTATION_HASH = "representationHash";
}
public static class TableCellProperties implements Serializable {
public static final String B_BOX = "bBox";
public static final String ROW = "row";
public static final String COL = "col";
public static final String HEADER = "header";
}
public static class DuplicateParagraphProperties implements Serializable {
public static final String UNSORTED_TEXTBLOCK_ID = "utbid";
}
public static final String RECTANGLE_DELIMITER = ";";
public static Rectangle2D parseRectangle2D(String bBox) {
List<Float> floats = Arrays.stream(bBox.split(RECTANGLE_DELIMITER))
.map(Float::parseFloat)
.toList();
return new Rectangle2D.Float(floats.get(0), floats.get(1), floats.get(2), floats.get(3));
}
public static double[] parseRepresentationVector(String representationHash) {
String[] stringArray = representationHash.split("[,\\s]+");
double[] doubleArray = new double[stringArray.length];
for (int i = 0; i < stringArray.length; i++) {
doubleArray[i] = Double.parseDouble(stringArray[i]);
}
return doubleArray;
}
public EntryData get(List<Integer> tocId) {
if (tocId.isEmpty()) {
return root;
}
EntryData entry = root.children.get(tocId.get(0));
for (int id : tocId.subList(1, tocId.size())) {
entry = entry.children.get(id);
}
return entry;
}
public Stream<EntryData> streamAllEntries() {
return Stream.concat(Stream.of(root), root.children.stream())
.flatMap(DocumentStructure::flatten);
}
@Override
public String toString() {
return String.join("\n",
streamAllEntries().map(EntryData::toString)
.toList());
}
private static Stream<EntryData> flatten(EntryData entry) {
return Stream.concat(Stream.of(entry),
entry.children.stream()
.flatMap(DocumentStructure::flatten));
}
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
@FieldDefaults(level = AccessLevel.PRIVATE)
public static class EntryData implements Serializable {
NodeType type;
int[] treeId;
Long[] atomicBlockIds;
Long[] pageNumbers;
Map<String, String> properties;
List<EntryData> children;
Set<LayoutEngine> engines;
@Override
public String toString() {
StringBuilder sb = new StringBuilder();
sb.append("[");
for (int i : treeId) {
sb.append(i);
sb.append(",");
}
sb.delete(sb.length() - 1, sb.length());
sb.append("]: ");
sb.append(type);
sb.append(" atbs = ");
sb.append(atomicBlockIds.length);
return sb.toString();
}
}
}
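The deleted `DocumentStructure` class above serializes geometry as delimited strings: rectangles as `x;y;width;height` and representation vectors as comma- or whitespace-separated doubles. A minimal, self-contained sketch of that round trip follows; the class and method names here are illustrative stand-ins, not the production API.

```java
import java.util.Locale;

// Sketch of DocumentStructure's string round trip: rectangles use the
// ";" delimiter, representation vectors split on "[,\\s]+".
public class GeometryRoundTrip {

    static final String DELIMITER = ";";

    static String formatRectangle(float x, float y, float w, float h) {
        return String.format(Locale.US, "%f%s%f%s%f%s%f",
                x, DELIMITER, y, DELIMITER, w, DELIMITER, h);
    }

    static float[] parseRectangle(String bBox) {
        String[] parts = bBox.split(DELIMITER);
        float[] values = new float[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Float.parseFloat(parts[i]);
        }
        return values; // {x, y, width, height}
    }

    static double[] parseRepresentationVector(String hash) {
        // Same tokenization as parseRepresentationVector in the class above.
        String[] tokens = hash.split("[,\\s]+");
        double[] result = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            result[i] =Ouble(tokens[i]);
        }
        return result;
    }

    private static double Ouble(String token) {
        return Double.parseDouble(token);
    }

    public static void main(String[] args) {
        float[] rect = parseRectangle(formatRectangle(1.5f, 2f, 10f, 20f));
        if (rect[0] != 1.5f || rect[3] != 20f) throw new AssertionError();
        if (parseRepresentationVector("0.1, 0.2 0.3").length != 3) throw new AssertionError();
    }
}
```

Using `Locale.US` in the formatter keeps the decimal separator a `.` regardless of the JVM's default locale, which is what makes the parse side safe.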

@@ -1,28 +0,0 @@
package com.iqser.red.service.redaction.v1.server.data.old;
import java.io.Serializable;
import lombok.AccessLevel;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.experimental.FieldDefaults;
@Deprecated
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
@FieldDefaults(level = AccessLevel.PRIVATE)
public class DocumentTextData implements Serializable {
Long id;
Long page;
String searchText;
int numberOnPage;
int start;
int end;
int[] lineBreaks;
}

@@ -1,8 +0,0 @@
package com.iqser.red.service.redaction.v1.server.data.old;
@Deprecated
public enum LayoutEngine {
ALGORITHM,
AI,
OUTLINE
}

@@ -1,24 +0,0 @@
package com.iqser.red.service.redaction.v1.server.data.old;
import java.io.Serializable;
import java.util.Locale;
@Deprecated
public enum NodeType implements Serializable {
DOCUMENT,
SECTION,
SUPER_SECTION,
HEADLINE,
PARAGRAPH,
TABLE,
TABLE_CELL,
IMAGE,
HEADER,
FOOTER;
@Override
public String toString() {
return this.name().charAt(0) + this.name().substring(1).toLowerCase(Locale.ROOT);
}
}

@@ -1,199 +0,0 @@
package com.iqser.red.service.redaction.v1.server.mapper;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import com.iqser.red.service.redaction.v1.server.data.DocumentData;
import com.iqser.red.service.redaction.v1.server.data.DocumentPageProto.AllDocumentPages;
import com.iqser.red.service.redaction.v1.server.data.DocumentPageProto.DocumentPage;
import com.iqser.red.service.redaction.v1.server.data.DocumentPositionDataProto.AllDocumentPositionData;
import com.iqser.red.service.redaction.v1.server.data.DocumentPositionDataProto.DocumentPositionData;
import com.iqser.red.service.redaction.v1.server.data.DocumentPositionDataProto.DocumentPositionData.Position;
import com.iqser.red.service.redaction.v1.server.data.DocumentStructureProto.DocumentStructure;
import com.iqser.red.service.redaction.v1.server.data.DocumentStructureWrapper;
import com.iqser.red.service.redaction.v1.server.data.DocumentTextDataProto.AllDocumentTextData;
import com.iqser.red.service.redaction.v1.server.data.DocumentTextDataProto.DocumentTextData;
import com.iqser.red.service.redaction.v1.server.data.EntryDataProto.EntryData;
import com.iqser.red.service.redaction.v1.server.data.LayoutEngineProto;
import com.iqser.red.service.redaction.v1.server.data.NodeTypeProto;
import com.iqser.red.service.redaction.v1.server.data.RangeProto;
import com.iqser.red.service.redaction.v1.server.model.document.DocumentTree;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Document;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.DuplicatedParagraph;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Image;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.NodeType;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Page;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Table;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.TableCell;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.AtomicTextBlock;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlock;
import lombok.experimental.UtilityClass;
@UtilityClass
public class DocumentDataMapper {
public DocumentData toDocumentData(Document document) {
List<DocumentTextData> documentTextData = document.streamTerminalTextBlocksInOrder()
.flatMap(textBlock -> textBlock.getAtomicTextBlocks()
.stream())
.distinct()
.map(DocumentDataMapper::toAtomicTextBlockData)
.toList();
AllDocumentTextData allDocumentTextData = AllDocumentTextData.newBuilder().addAllDocumentTextData(documentTextData).build();
List<DocumentPositionData> atomicPositionBlockData = document.streamTerminalTextBlocksInOrder()
.flatMap(textBlock -> textBlock.getAtomicTextBlocks()
.stream())
.distinct()
.map(DocumentDataMapper::toAtomicPositionBlockData)
.toList();
AllDocumentPositionData allDocumentPositionData = AllDocumentPositionData.newBuilder().addAllDocumentPositionData(atomicPositionBlockData).build();
List<DocumentPage> documentPageData = document.getPages()
.stream()
.sorted(Comparator.comparingInt(Page::getNumber))
.map(DocumentDataMapper::toPageData)
.toList();
AllDocumentPages allDocumentPages = AllDocumentPages.newBuilder().addAllDocumentPages(documentPageData).build();
DocumentStructureWrapper tableOfContentsData = toDocumentTreeData(document.getDocumentTree());
return DocumentData.builder()
.documentTextData(allDocumentTextData)
.documentPositionData(allDocumentPositionData)
.documentPages(allDocumentPages)
.documentStructureWrapper(tableOfContentsData)
.build();
}
private DocumentStructureWrapper toDocumentTreeData(DocumentTree documentTree) {
return new DocumentStructureWrapper(DocumentStructure.newBuilder().setRoot(toEntryData(documentTree.getRoot())).build());
}
private EntryData toEntryData(DocumentTree.Entry entry) {
List<Long> atomicTextBlocks;
if (entry.getNode().isLeaf()) {
atomicTextBlocks = toAtomicTextBlockIds(entry.getNode().getLeafTextBlock());
} else {
atomicTextBlocks = new ArrayList<>();
}
Map<String, String> properties = switch (entry.getType()) {
case TABLE -> PropertiesMapper.buildTableProperties((Table) entry.getNode());
case TABLE_CELL -> PropertiesMapper.buildTableCellProperties((TableCell) entry.getNode());
case IMAGE -> PropertiesMapper.buildImageProperties((Image) entry.getNode());
case PARAGRAPH ->
entry.getNode() instanceof DuplicatedParagraph duplicatedParagraph ? PropertiesMapper.buildDuplicateParagraphProperties(duplicatedParagraph) : new HashMap<>();
default -> new HashMap<>();
};
var documentBuilder = EntryData.newBuilder()
.addAllTreeId(entry.getTreeId())
.addAllChildren(entry.getChildren()
.stream()
.map(DocumentDataMapper::toEntryData)
.toList())
.setType(resolveType(entry.getType()))
.addAllAtomicBlockIds(atomicTextBlocks)
.addAllPageNumbers(entry.getNode().getPages()
.stream()
.map(Page::getNumber)
.map(Integer::longValue)
.toList())
.putAllProperties(properties);
if (entry.getNode() != null) {
documentBuilder.addAllEngines(entry.getNode().getEngines()
.stream()
.map(engine -> LayoutEngineProto.LayoutEngine.valueOf(engine.name()))
.toList());
} else {
documentBuilder.addAllEngines(new HashSet<>(Set.of(LayoutEngineProto.LayoutEngine.ALGORITHM)));
}
return documentBuilder.build();
}
private static NodeTypeProto.NodeType resolveType(NodeType type) {
return NodeTypeProto.NodeType.valueOf(type.name());
}
private List<Long> toAtomicTextBlockIds(TextBlock textBlock) {
return textBlock.getAtomicTextBlocks()
.stream()
.map(AtomicTextBlock::getId)
.toList();
}
private DocumentPage toPageData(Page p) {
return DocumentPage.newBuilder().setRotation(p.getRotation()).setHeight(p.getHeight()).setWidth(p.getWidth()).setNumber(p.getNumber()).build();
}
private DocumentTextData toAtomicTextBlockData(AtomicTextBlock atomicTextBlock) {
return DocumentTextData.newBuilder()
.setId(atomicTextBlock.getId())
.setPage(atomicTextBlock.getPage().getNumber().longValue())
.setSearchText(atomicTextBlock.getSearchText())
.setNumberOnPage(atomicTextBlock.getNumberOnPage())
.setStart(atomicTextBlock.getTextRange().start())
.setEnd(atomicTextBlock.getTextRange().end())
.addAllLineBreaks(atomicTextBlock.getLineBreaks())
.addAllItalicTextRanges(atomicTextBlock.getItalicTextRanges()
.stream()
.map(r -> RangeProto.Range.newBuilder().setStart(r.start()).setEnd(r.end()).build())
.toList())
.addAllBoldTextRanges(atomicTextBlock.getBoldTextRanges()
.stream()
.map(r -> RangeProto.Range.newBuilder().setStart(r.start()).setEnd(r.end()).build())
.toList())
.build();
}
private DocumentPositionData toAtomicPositionBlockData(AtomicTextBlock atomicTextBlock) {
return DocumentPositionData.newBuilder()
.setId(atomicTextBlock.getId())
.addAllPositions(toPositions(atomicTextBlock.getPositions()))
.addAllStringIdxToPositionIdx(atomicTextBlock.getStringIdxToPositionIdx())
.build();
}
private static List<Position> toPositions(List<Rectangle2D> rects) {
List<Position> positions = new ArrayList<>();
for (Rectangle2D rect : rects) {
positions.add(toPosition(rect));
}
return positions;
}
private static Position toPosition(Rectangle2D rect) {
return Position.newBuilder().addValue((float) rect.getMinX()).addValue((float) rect.getMinY()).addValue((float) rect.getWidth()).addValue((float) rect.getHeight()).build();
}
}

@@ -1,152 +0,0 @@
package com.iqser.red.service.redaction.v1.server.mapper;
import java.awt.geom.Rectangle2D;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import com.iqser.red.service.redaction.v1.server.data.DocumentStructureWrapper;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.DuplicatedParagraph;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Image;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.ImageType;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Table;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.TableCell;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.AtomicTextBlock;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlock;
import lombok.experimental.UtilityClass;
@UtilityClass
public class PropertiesMapper {
public static Map<String, String> buildImageProperties(Image image) {
Map<String, String> properties = new HashMap<>();
properties.put(DocumentStructureWrapper.ImageProperties.IMAGE_TYPE, image.getImageType().name());
properties.put(DocumentStructureWrapper.ImageProperties.TRANSPARENT, String.valueOf(image.isTransparent()));
properties.put(DocumentStructureWrapper.ImageProperties.POSITION, toString(image.getPosition()));
properties.put(DocumentStructureWrapper.ImageProperties.ID, image.getId());
properties.put(DocumentStructureWrapper.ImageProperties.REPRESENTATION_HASH, image.getRepresentationHash());
return properties;
}
public static Map<String, String> buildTableCellProperties(TableCell tableCell) {
Map<String, String> properties = new HashMap<>();
properties.put(DocumentStructureWrapper.TableCellProperties.ROW, String.valueOf(tableCell.getRow()));
properties.put(DocumentStructureWrapper.TableCellProperties.COL, String.valueOf(tableCell.getCol()));
properties.put(DocumentStructureWrapper.TableCellProperties.HEADER, String.valueOf(tableCell.isHeader()));
if (tableCell.getPages().size() > 1 || tableCell.getBBox().keySet().size() > 1) {
throw new IllegalArgumentException("TableCell can only occur on a single page!");
}
String bBoxString = toString(tableCell.getBBox()
.get(tableCell.getPages()
.stream()
.findFirst()
.get()));
properties.put(DocumentStructureWrapper.TableCellProperties.B_BOX, bBoxString);
return properties;
}
public static Map<String, String> buildTableProperties(Table table) {
Map<String, String> properties = new HashMap<>();
properties.put(DocumentStructureWrapper.TableProperties.NUMBER_OF_ROWS, String.valueOf(table.getNumberOfRows()));
properties.put(DocumentStructureWrapper.TableProperties.NUMBER_OF_COLS, String.valueOf(table.getNumberOfCols()));
return properties;
}
public static void parseImageProperties(Map<String, String> properties, Image.ImageBuilder<?, ?> builder) {
builder.imageType(parseImageType(properties.get(DocumentStructureWrapper.ImageProperties.IMAGE_TYPE)));
builder.transparent(Boolean.parseBoolean(properties.get(DocumentStructureWrapper.ImageProperties.TRANSPARENT)));
builder.position(DocumentStructureWrapper.parseRectangle2D(properties.get(DocumentStructureWrapper.ImageProperties.POSITION)));
builder.id(properties.get(DocumentStructureWrapper.ImageProperties.ID));
}
public static void parseTableCellProperties(Map<String, String> properties, TableCell.TableCellBuilder<?, ?> builder) {
builder.row(Integer.parseInt(properties.get(DocumentStructureWrapper.TableCellProperties.ROW)));
builder.col(Integer.parseInt(properties.get(DocumentStructureWrapper.TableCellProperties.COL)));
builder.header(Boolean.parseBoolean(properties.get(DocumentStructureWrapper.TableCellProperties.HEADER)));
builder.bBox(DocumentStructureWrapper.parseRectangle2D(properties.get(DocumentStructureWrapper.TableCellProperties.B_BOX)));
}
public static void parseTableProperties(Map<String, String> properties, Table.TableBuilder builder) {
builder.numberOfRows(Integer.parseInt(properties.get(DocumentStructureWrapper.TableProperties.NUMBER_OF_ROWS)));
builder.numberOfCols(Integer.parseInt(properties.get(DocumentStructureWrapper.TableProperties.NUMBER_OF_COLS)));
}
public static Map<String, String> buildDuplicateParagraphProperties(DuplicatedParagraph duplicatedParagraph) {
Map<String, String> properties = new HashMap<>();
properties.put(DocumentStructureWrapper.DuplicateParagraphProperties.UNSORTED_TEXTBLOCK_ID,
Arrays.toString(toAtomicTextBlockIds(duplicatedParagraph.getUnsortedLeafTextBlock())));
return properties;
}
public static boolean isDuplicateParagraph(Map<String, String> properties) {
return properties.containsKey(DocumentStructureWrapper.DuplicateParagraphProperties.UNSORTED_TEXTBLOCK_ID);
}
public static List<Long> getUnsortedTextblockIds(Map<String, String> properties) {
return toLongList(properties.get(DocumentStructureWrapper.DuplicateParagraphProperties.UNSORTED_TEXTBLOCK_ID));
}
public static List<Long> toLongList(String ids) {
return Arrays.stream(ids.substring(1, ids.length() - 1).trim().split(","))
.map(Long::valueOf)
.toList();
}
private static ImageType parseImageType(String imageType) {
try {
return ImageType.valueOf(imageType.toUpperCase(Locale.ROOT));
} catch (IllegalArgumentException e) {
return ImageType.OTHER;
}
}
public static String toString(Rectangle2D rectangle2D) {
return String.format(Locale.US,
"%f%s%f%s%f%s%f",
rectangle2D.getX(),
DocumentStructureWrapper.RECTANGLE_DELIMITER,
rectangle2D.getY(),
DocumentStructureWrapper.RECTANGLE_DELIMITER,
rectangle2D.getWidth(),
DocumentStructureWrapper.RECTANGLE_DELIMITER,
rectangle2D.getHeight());
}
private static Long[] toAtomicTextBlockIds(TextBlock textBlock) {
return textBlock.getAtomicTextBlocks()
.stream()
.map(AtomicTextBlock::getId)
.toArray(Long[]::new);
}
}
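`PropertiesMapper` writes the unsorted text-block ids with `Arrays.toString` (producing e.g. `"[1, 2, 3]"`) and reads them back with `toLongList`. A self-contained sketch of that round trip, with simple stand-in method names; note the per-token trim, since `Arrays.toString` separates entries with `", "`:

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the id round trip in PropertiesMapper: ids serialized via
// Arrays.toString, parsed back into a List<Long>.
public class IdListRoundTrip {

    static String serialize(Long[] ids) {
        return Arrays.toString(ids); // e.g. "[1, 2, 3]"
    }

    static List<Long> toLongList(String ids) {
        return Arrays.stream(ids.substring(1, ids.length() - 1).split(","))
                .map(String::trim)   // drop the space after each comma
                .map(Long::valueOf)
                .toList();
    }

    public static void main(String[] args) {
        List<Long> parsed = toLongList(serialize(new Long[] { 1L, 2L, 3L }));
        if (!parsed.equals(List.of(1L, 2L, 3L))) throw new AssertionError();
    }
}
```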

@@ -1,116 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Document;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Footer;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Header;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Headline;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Image;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Paragraph;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Section;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.SemanticNode;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.SuperSection;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Table;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.TableCell;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.TableOfContents;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.TableOfContentsItem;
public abstract class AbstractNodeVisitor implements NodeVisitor {
@Override
public void visit(Document document) {
defaultVisit(document);
}
@Override
public void visit(SuperSection superSection) {
defaultVisit(superSection);
}
@Override
public void visit(Section section) {
defaultVisit(section);
}
@Override
public void visit(Headline headline) {
defaultVisit(headline);
}
@Override
public void visit(Paragraph paragraph) {
defaultVisit(paragraph);
}
@Override
public void visit(Footer footer) {
defaultVisit(footer);
}
@Override
public void visit(Header header) {
defaultVisit(header);
}
@Override
public void visit(Image image) {
defaultVisit(image);
}
@Override
public void visit(Table table) {
defaultVisit(table);
}
@Override
public void visit(TableCell tableCell) {
defaultVisit(tableCell);
}
@Override
public void visit(TableOfContents toc) {
defaultVisit(toc);
}
@Override
public void visit(TableOfContentsItem toci) {
defaultVisit(toci);
}
public void visitNodeDefault(SemanticNode node) {
// By default, it does nothing
}
protected void defaultVisit(SemanticNode semanticNode) {
visitNodeDefault(semanticNode);
semanticNode.streamChildren()
.forEach(node -> node.accept(this));
}
}

@@ -1,32 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document;
import java.util.HashSet;
import java.util.Set;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.SemanticNode;
import lombok.Getter;
public class IntersectingNodeVisitor extends AbstractNodeVisitor {
@Getter
private Set<SemanticNode> intersectingNodes;
private final TextRange textRange;
public IntersectingNodeVisitor(TextRange textRange) {
this.textRange = textRange;
this.intersectingNodes = new HashSet<>();
}
@Override
public void visitNodeDefault(SemanticNode node) {
if (textRange.intersects(node.getTextRange())) {
intersectingNodes.add(node);
}
}
}
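`IntersectingNodeVisitor` walks the tree and collects every node whose text range overlaps a query range. The sketch below shows the overlap computation in isolation; the `Range` record stands in for `SemanticNode`/`TextRange`, and the half-open overlap test is an assumption about what `TextRange.intersects` does.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal analogue of IntersectingNodeVisitor's collection step.
public class IntersectionSketch {

    record Range(int start, int end) {
        // Half-open overlap test: ranges touch-but-not-overlap do not intersect.
        boolean intersects(Range other) {
            return start < other.end() && other.start() < end;
        }
    }

    static List<Range> intersecting(Range query, List<Range> nodes) {
        List<Range> result = new ArrayList<>();
        for (Range node : nodes) {
            if (query.intersects(node)) {
                result.add(node);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Range> hits = intersecting(new Range(5, 10),
                List.of(new Range(0, 6), new Range(10, 12), new Range(8, 20)));
        if (hits.size() != 2) throw new AssertionError();
    }
}
```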

@@ -1,53 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Document;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Footer;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Header;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Headline;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Image;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Paragraph;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Section;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.SuperSection;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Table;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.TableCell;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.TableOfContents;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.TableOfContentsItem;
public interface NodeVisitor {
void visit(Document document);
void visit(SuperSection superSection);
void visit(Section section);
void visit(Headline headline);
void visit(Paragraph paragraph);
void visit(Footer footer);
void visit(Header header);
void visit(Image image);
void visit(Table table);
void visit(TableCell tableCell);
void visit(TableOfContents tableOfContents);
void visit(TableOfContentsItem tableOfContentsItem);
}

@@ -1,20 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document.entity;
import lombok.Getter;
import lombok.RequiredArgsConstructor;
@Getter
@RequiredArgsConstructor
public abstract class AbstractRelation implements Relation {
protected final TextEntity a;
protected final TextEntity b;
@Override
public String toString() {
return this.getClass().getSimpleName() + "{" + "a=" + a + ", b=" + b + '}';
}
}

@@ -1,18 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document.entity;
public class Containment extends Intersection {
public Containment(TextEntity container, TextEntity contained) {
super(container, contained);
}
public TextEntity getContainer() {
return a;
}
public TextEntity getContained() {
return b;
}
}

@@ -1,25 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document.entity;
public interface EntityEventListener {
/**
* Invoked when an entity is inserted.
*
* @param entity The entity that was inserted.
*/
void onEntityInserted(IEntity entity);
/**
* Invoked when an entity is updated.
*
* @param entity The entity that was updated.
*/
void onEntityUpdated(IEntity entity);
/**
* Invoked when an entity is removed.
*
* @param entity The entity that was removed.
*/
void onEntityRemoved(IEntity entity);
}

@@ -1,10 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document.entity;
public class Equality extends Containment {
public Equality(TextEntity a, TextEntity b) {
super(a, b);
}
}

@@ -1,10 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document.entity;
public class Intersection extends AbstractRelation {
public Intersection(TextEntity a, TextEntity b) {
super(a, b);
}
}

@@ -1,10 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document.entity;
public interface Relation {
TextEntity getA();
TextEntity getB();
}

@@ -1,7 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document.nodes;
public enum LayoutEngine {
ALGORITHM,
AI,
OUTLINE
}

@@ -1,101 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document.nodes;
import com.iqser.red.service.redaction.v1.server.model.document.NodeVisitor;
import lombok.AccessLevel;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.EqualsAndHashCode;
import lombok.experimental.FieldDefaults;
import lombok.experimental.SuperBuilder;
import lombok.extern.slf4j.Slf4j;
/**
* Represents a section within a document, encapsulating both its textual content and semantic structure.
*/
@Slf4j
@Data
@SuperBuilder
@AllArgsConstructor
@FieldDefaults(level = AccessLevel.PRIVATE)
@EqualsAndHashCode(onlyExplicitlyIncluded = true, callSuper = true)
public class Section extends AbstractSemanticNode {
@Override
public NodeType getType() {
return NodeType.SECTION;
}
/**
* Checks if this section contains any tables.
*
* @return True if the section contains at least one table, false otherwise.
*/
public boolean hasTables() {
return streamAllSubNodesOfType(NodeType.TABLE).findAny()
.isPresent();
}
/**
* Returns the SectionIdentifier from the headline obtained by the getHeadline() method.
*
* @return the SectionIdentifier of the associated Headline
*/
@Override
public SectionIdentifier getSectionIdentifier() {
return getHeadline().getSectionIdentifier();
}
@Override
public String toString() {
return getTreeId() + ": " + NodeType.SECTION + ": " + this.getTextBlock().buildSummary();
}
public Headline getHeadline() {
return streamChildrenOfType(NodeType.HEADLINE)//
.map(node -> (Headline) node)//
.findFirst()//
.orElseGet(() -> getParent().getHeadline());
}
/**
* Checks if any headline within this section or its sub-nodes contains a given string.
*
* @param value The string to search for within headlines, case-sensitive.
* @return True if at least one headline contains the specified string, false otherwise.
*/
public boolean anyHeadlineContainsString(String value) {
return streamAllSubNodesOfType(NodeType.HEADLINE).anyMatch(h -> h.containsString(value));
}
/**
* Checks if any headline within this section or its sub-nodes contains a given string, case-insensitive.
*
* @param value The string to search for within headlines, case-insensitive.
* @return True if at least one headline contains the specified string, false otherwise.
*/
public boolean anyHeadlineContainsStringIgnoreCase(String value) {
return streamAllSubNodesOfType(NodeType.HEADLINE).anyMatch(h -> h.containsStringIgnoreCase(value));
}
@Override
public void accept(NodeVisitor visitor) {
visitor.visit(this);
}
}

@@ -1,47 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document.nodes;
import com.iqser.red.service.redaction.v1.server.model.document.NodeVisitor;
import lombok.AccessLevel;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.EqualsAndHashCode;
import lombok.experimental.FieldDefaults;
import lombok.experimental.SuperBuilder;
@Data
@SuperBuilder
@AllArgsConstructor
@FieldDefaults(level = AccessLevel.PRIVATE)
@EqualsAndHashCode(callSuper = true)
public class TableOfContents extends AbstractSemanticNode {
@Override
public NodeType getType() {
return NodeType.TABLE_OF_CONTENTS;
}
public Headline getHeadline() {
return streamChildrenOfType(NodeType.HEADLINE).map(node -> (Headline) node)
.findFirst()
.orElseGet(() -> getParent().getHeadline());
}
@Override
public void accept(NodeVisitor visitor) {
visitor.visit(this);
}
@Override
public String toString() {
return getTreeId() + ": " + NodeType.TABLE_OF_CONTENTS + ": " + getTextBlock().buildSummary();
}
}

@@ -1,57 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document.nodes;
import com.iqser.red.service.redaction.v1.server.model.document.NodeVisitor;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlock;
import lombok.AccessLevel;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.EqualsAndHashCode;
import lombok.experimental.FieldDefaults;
import lombok.experimental.SuperBuilder;
@Data
@SuperBuilder
@AllArgsConstructor
@FieldDefaults(level = AccessLevel.PRIVATE)
@EqualsAndHashCode(callSuper = true)
public class TableOfContentsItem extends AbstractSemanticNode {
TextBlock leafTextBlock;
@Override
public NodeType getType() {
return NodeType.TABLE_OF_CONTENTS_ITEM;
}
@Override
public boolean isLeaf() {
return true;
}
@Override
public void accept(NodeVisitor visitor) {
visitor.visit(this);
}
@Override
public TextBlock getTextBlock() {
return leafTextBlock;
}
@Override
public String toString() {
return getTreeId() + ": " + NodeType.TABLE_OF_CONTENTS_ITEM + ": " + leafTextBlock.buildSummary();
}
}

@@ -1,72 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document.textblock;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Stream;
import lombok.NoArgsConstructor;
@NoArgsConstructor
public class ConsecutiveTextBlockCollector implements Collector<TextBlock, List<ConcatenatedTextBlock>, List<TextBlock>> {
@Override
public Supplier<List<ConcatenatedTextBlock>> supplier() {
return LinkedList::new;
}
@Override
public BiConsumer<List<ConcatenatedTextBlock>, TextBlock> accumulator() {
return (existingList, textBlock) -> {
if (existingList.isEmpty()) {
ConcatenatedTextBlock ctb = ConcatenatedTextBlock.empty();
ctb.concat(textBlock);
existingList.add(ctb);
return;
}
ConcatenatedTextBlock prevBlock = existingList.get(existingList.size() - 1);
if (prevBlock.getTextRange().end() == textBlock.getTextRange().start()) {
prevBlock.concat(textBlock);
} else {
ConcatenatedTextBlock ctb = ConcatenatedTextBlock.empty();
ctb.concat(textBlock);
existingList.add(ctb);
}
};
}
@Override
public BinaryOperator<List<ConcatenatedTextBlock>> combiner() {
return (list1, list2) -> Stream.concat(list1.stream(), list2.stream())
.toList();
}
@Override
public Function<List<ConcatenatedTextBlock>, List<TextBlock>> finisher() {
return a -> a.stream()
.map(tb -> (TextBlock) tb)
.toList();
}
@Override
public Set<Characteristics> characteristics() {
return Set.of(Characteristics.IDENTITY_FINISH);
}
}
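The accumulator above opens a new ConcatenatedTextBlock unless the incoming block's text range starts exactly where the previous group ends. A self-contained sketch of that grouping rule, using a hypothetical Range record instead of the service's TextBlock types (the real collector concatenates the blocks themselves rather than merging ranges):

```java
import java.util.ArrayList;
import java.util.List;

public class ConsecutiveRanges {

    // Stand-in for a text block's half-open text range [start, end)
    record Range(int start, int end) {}

    // Merges each range into the previous group when it starts exactly
    // where that group ends; otherwise starts a new group.
    static List<Range> mergeConsecutive(List<Range> input) {
        List<Range> out = new ArrayList<>();
        for (Range r : input) {
            if (!out.isEmpty() && out.get(out.size() - 1).end() == r.start()) {
                Range prev = out.remove(out.size() - 1);
                out.add(new Range(prev.start(), r.end()));
            } else {
                out.add(r);
            }
        }
        return out;
    }
}
```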

@ -1,25 +0,0 @@
syntax = "proto3";
option java_outer_classname = "DocumentPageProto";
option java_package = "com.iqser.red.service.redaction.v1.server.data";
message AllDocumentPages {
repeated DocumentPage documentPages = 1;
}
message DocumentPage {
// The page number, starting with 1.
int32 number = 1;
// The page height in PDF user units.
int32 height = 2;
// The page width in PDF user units.
int32 width = 3;
// The page rotation as specified by the PDF.
int32 rotation = 4;
}

@ -1,28 +0,0 @@
syntax = "proto3";
option java_outer_classname = "DocumentPositionDataProto";
option java_package = "com.iqser.red.service.redaction.v1.server.data";
message AllDocumentPositionData {
repeated DocumentPositionData documentPositionData = 1;
}
message DocumentPositionData {
// Identifier of the text block.
int64 id = 1;
// For each string index in the search text of the text block, this array maps the string index to the corresponding position index.
// This mapping is required because string indices and glyph position indices do not align one-to-one.
repeated int32 stringIdxToPositionIdx = 2;
// The bounding box of each glyph as a rectangle. This matrix has size (n, 4), where n is the number of glyphs in the text block.
// The second dimension holds the values x, y, width, height, with x, y specifying the lower-left corner.
// To access this information, string indices must first be translated through the stringIdxToPositionIdx array.
repeated Position positions = 3;
// A glyph position, given as a bounding box with the values x, y, width, and height.
message Position {
repeated float value = 1;
}
}
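The stringIdxToPositionIdx indirection described in the comments can be sketched as a small lookup helper. The array names mirror the proto fields; the flattened float[] boxes and the sample values are invented for illustration:

```java
public class GlyphBoxLookup {

    // Returns the {x, y, width, height} box for the glyph rendered at the
    // given string index, translating it to a position index first.
    static float[] boxForStringIndex(int stringIdx, int[] stringIdxToPositionIdx, float[][] positions) {
        return positions[stringIdxToPositionIdx[stringIdx]];
    }
}
```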

@ -1,12 +0,0 @@
syntax = "proto3";
option java_outer_classname = "DocumentStructureProto";
option java_package = "com.iqser.red.service.redaction.v1.server.data";
import "EntryData.proto";
message DocumentStructure {
// The root EntryData represents the Document.
EntryData root = 1;
}

@ -1,40 +0,0 @@
syntax = "proto3";
import "Range.proto";
option java_outer_classname = "DocumentTextDataProto";
option java_package = "com.iqser.red.service.redaction.v1.server.data";
message AllDocumentTextData {
repeated DocumentTextData documentTextData = 1;
}
message DocumentTextData {
// Identifier of the text block.
int64 id = 1;
// The page the text block occurs on.
int64 page = 2;
// The text of the text block.
string searchText = 3;
// Each text block is assigned a number on a page, starting from 0.
int32 numberOnPage = 4;
// The text blocks are ordered; this number represents the start of the text block as a string offset.
int32 start = 5;
// The text blocks are ordered; this number represents the end of the text block as a string offset.
int32 end = 6;
// The line breaks in the text of this semantic node, as string offsets; each offset is an exclusive line end. There is an implicit line break at the end of each semantic node.
repeated int32 lineBreaks = 7;
// The text ranges where the text is italic
repeated Range italicTextRanges = 8;
// The text ranges where the text is bold
repeated Range boldTextRanges = 9;
}
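Assuming the lineBreaks offsets are relative to the text block's searchText, with each offset an exclusive line end and an implicit break after the last character, splitting the text into lines might look like this (an illustrative helper, not part of the service):

```java
import java.util.ArrayList;
import java.util.List;

public class LineSplitter {

    // Splits text at exclusive-end line-break offsets; the final line
    // ends at the implicit break after the last character.
    static List<String> splitAtLineBreaks(String text, int[] lineBreaks) {
        List<String> lines = new ArrayList<>();
        int prev = 0;
        for (int br : lineBreaks) {
            lines.add(text.substring(prev, br));
            prev = br;
        }
        lines.add(text.substring(prev));
        return lines;
    }
}
```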

@ -1,30 +0,0 @@
syntax = "proto3";
import "LayoutEngine.proto";
import "NodeType.proto";
option java_outer_classname = "EntryDataProto";
option java_package = "com.iqser.red.service.redaction.v1.server.data";
message EntryData {
// Type of the semantic node.
NodeType type = 1;
// Specifies the position in the parsed tree structure.
repeated int32 treeId = 2;
// Specifies the text block IDs associated with this semantic node.
repeated int64 atomicBlockIds = 3;
// Specifies the pages this semantic node appears on.
repeated int64 pageNumbers = 4;
// Some semantic nodes carry additional information, which is stored in this map.
map<string, string> properties = 5;
// All child Entries of this Entry.
repeated EntryData children = 6;
// Describes the origin of the semantic node.
repeated LayoutEngine engines = 7;
}

@ -1,10 +0,0 @@
syntax = "proto3";
option java_outer_classname = "LayoutEngineProto";
option java_package = "com.iqser.red.service.redaction.v1.server.data";
enum LayoutEngine {
ALGORITHM = 0;
AI = 1;
OUTLINE = 2;
}

@ -1,19 +0,0 @@
syntax = "proto3";
option java_outer_classname = "NodeTypeProto";
option java_package = "com.iqser.red.service.redaction.v1.server.data";
enum NodeType {
DOCUMENT = 0;
SECTION = 1;
SUPER_SECTION = 2;
HEADLINE = 3;
PARAGRAPH = 4;
TABLE = 5;
TABLE_CELL = 6;
IMAGE = 7;
HEADER = 8;
FOOTER = 9;
TABLE_OF_CONTENTS = 10;
TABLE_OF_CONTENTS_ITEM = 11;
}

@ -1,14 +0,0 @@
syntax = "proto3";
option java_outer_classname = "RangeProto";
option java_package = "com.iqser.red.service.redaction.v1.server.data";
message Range {
// A start index.
int32 start = 1;
// An end index.
int32 end = 2;
}

@ -1,26 +0,0 @@
#!/bin/bash
# Minimum required protoc version
MIN_VERSION="28.3"
# Check that protoc is installed before querying its version
if ! command -v protoc &> /dev/null; then
echo "Error: protoc is not installed. Please install version $MIN_VERSION or later."
exit 1
fi
# Get the installed protoc version
INSTALLED_VERSION=$(protoc --version | awk '{print $2}')
# Returns success if $1 is strictly lower than $2 (version-aware compare)
version_lt() {
[ "$1" != "$2" ] && [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}
if version_lt "$INSTALLED_VERSION" "$MIN_VERSION"; then
echo "Error: protoc version $INSTALLED_VERSION is too old. Please upgrade to version $MIN_VERSION or later."
exit 1
fi
# Generate Java files from proto files
protoc --java_out=../java ./*.proto
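The version comparison the script delegates to sort -V can be mirrored in Java for simple numeric dotted versions such as "28.3" (an illustrative reimplementation, not part of the build):

```java
public class VersionCompare {

    // True if version a is strictly lower than version b, comparing
    // dot-separated numeric components left to right; missing
    // components are treated as 0.
    static boolean versionLt(String a, String b) {
        String[] as = a.split("\\.");
        String[] bs = b.split("\\.");
        int n = Math.max(as.length, bs.length);
        for (int i = 0; i < n; i++) {
            int ai = i < as.length ? Integer.parseInt(as[i]) : 0;
            int bi = i < bs.length ? Integer.parseInt(bs[i]) : 0;
            if (ai != bi) {
                return ai < bi;
            }
        }
        return false;
    }
}
```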

@ -1,33 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document.nodes;
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;
import com.iqser.red.service.redaction.v1.server.data.LayoutEngineProto;
public class LayoutEngineMappingTest {
@Test
public void assertAllValuesMatch() {
for (LayoutEngine value : LayoutEngine.values()) {
var engine = LayoutEngineProto.LayoutEngine.valueOf(value.name());
assertEquals(engine.name(), value.name());
}
}
@Test
public void assertAllValuesMatchReverse() {
for (LayoutEngineProto.LayoutEngine value : LayoutEngineProto.LayoutEngine.values()) {
if (value.equals(LayoutEngineProto.LayoutEngine.UNRECOGNIZED)) {
continue;
}
var engine = LayoutEngine.valueOf(value.name());
assertEquals(engine.name(), value.name());
}
}
}

@ -1,33 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document.nodes;
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;
import com.iqser.red.service.redaction.v1.server.data.NodeTypeProto;
public class NodeTypeMappingTest {
@Test
public void assertAllValuesMatch() {
for (NodeType value : NodeType.values()) {
var engine = NodeTypeProto.NodeType.valueOf(value.name());
assertEquals(engine.name(), value.name());
}
}
@Test
public void assertAllValuesMatchReverse() {
for (NodeTypeProto.NodeType value : NodeTypeProto.NodeType.values()) {
if (value.equals(NodeTypeProto.NodeType.UNRECOGNIZED)) {
continue;
}
var engine = NodeType.valueOf(value.name());
assertEquals(engine.name(), value.name());
}
}
}

@ -1,144 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.document.nodes;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;
import java.util.List;
import org.junit.jupiter.api.Test;
class SectionIdentifierTest {
@Test
void testSectionIdentifier() {
SectionIdentifier identifier = SectionIdentifier.fromSearchText("1.1.2: Headline");
assertEquals(SectionIdentifier.Format.NUMERICAL, identifier.getFormat());
assertEquals(3, identifier.level());
assertEquals(List.of(1, 1, 2), identifier.getIdentifiers());
SectionIdentifier child = SectionIdentifier.asChildOf(identifier);
assertTrue(child.isChildOf(identifier));
SectionIdentifier parent = SectionIdentifier.fromSearchText("1.1: Headline");
assertTrue(parent.isParentOf(identifier));
}
@Test
void testSectionIdentifier2() {
SectionIdentifier identifier = SectionIdentifier.fromSearchText("A.1.2: Headline");
assertEquals(SectionIdentifier.Format.ALPHANUMERIC, identifier.getFormat());
assertEquals(3, identifier.level());
assertEquals(List.of(1, 1, 2), identifier.getIdentifiers());
}
@Test
void testSectionIdentifier3() {
SectionIdentifier identifier = SectionIdentifier.fromSearchText("D.1.2: Headline");
assertEquals(SectionIdentifier.Format.ALPHANUMERIC, identifier.getFormat());
assertEquals(3, identifier.level());
assertEquals(List.of(4, 1, 2), identifier.getIdentifiers());
}
@Test
void testSectionIdentifier4() {
SectionIdentifier identifier = SectionIdentifier.fromSearchText("4.1.2.4: Headline");
assertEquals(SectionIdentifier.Format.NUMERICAL, identifier.getFormat());
assertEquals(4, identifier.level());
assertEquals(List.of(4, 1, 2, 4), identifier.getIdentifiers());
}
@Test
void testSectionIdentifier5() {
SectionIdentifier identifier = SectionIdentifier.fromSearchText("D.1.2.4.5: Headline");
assertEquals(SectionIdentifier.Format.ALPHANUMERIC, identifier.getFormat());
assertEquals(4, identifier.level());
assertEquals(List.of(4, 1, 2, 4), identifier.getIdentifiers());
}
@Test
void testSectionIdentifier6() {
SectionIdentifier identifier = SectionIdentifier.fromSearchText("d.1.2.4.5: Headline");
assertEquals(SectionIdentifier.Format.ALPHANUMERIC, identifier.getFormat());
assertEquals(4, identifier.level());
assertEquals(List.of(4, 1, 2, 4), identifier.getIdentifiers());
}
@Test
void testSectionIdentifier7() {
SectionIdentifier identifier = SectionIdentifier.fromSearchText("4.1.2.4.5: Headline");
assertEquals(SectionIdentifier.Format.NUMERICAL, identifier.getFormat());
assertEquals(4, identifier.level());
assertEquals(List.of(4, 1, 2, 4), identifier.getIdentifiers());
}
@Test
void testFalsePositive111() {
SectionIdentifier identifier = SectionIdentifier.fromSearchText("111: Headline");
assertEquals(SectionIdentifier.Format.NUMERICAL, identifier.getFormat());
assertEquals(1, identifier.level());
}
@Test
public void testParentOf() {
var headline = SectionIdentifier.fromSearchText("1 Did you ever hear the tragedy of Darth Plagueis The Wise?");
var headline1 = SectionIdentifier.fromSearchText("1.0 I thought not. Its not a story the Jedi would tell you.");
var headline2 = SectionIdentifier.fromSearchText("1.1 Its a Sith legend. Darth Plagueis was a Dark Lord of the Sith, ");
var headline3 = SectionIdentifier.fromSearchText("1.2.3 so powerful and so wise he could use the Force to influence the midichlorians to create life…");
var headline4 = SectionIdentifier.fromSearchText("1.2.3.4 He had such a knowledge of the dark side that he could even keep the ones he cared about from dying.");
var headline5 = SectionIdentifier.fromSearchText("1.2.3.4.5 The dark side of the Force is a pathway to many abilities some consider to be unnatural.");
var headline6 = SectionIdentifier.fromSearchText("2.0 He became so powerful…");
var headline7 = SectionIdentifier.fromSearchText("10000.0 the only thing he was afraid of was losing his power,");
var headline8 = SectionIdentifier.fromSearchText("A.0 which eventually, of course, he did.");
var headline9 = SectionIdentifier.fromSearchText("Unfortunately, he taught his apprentice everything he knew, then his apprentice killed him in his sleep.");
var headline10 = SectionIdentifier.fromSearchText("2.1.2 Ironic.");
var headline11 = SectionIdentifier.fromSearchText("2.He could save others from death,");
var headline12 = SectionIdentifier.fromSearchText(" 2. but not himself.");
var paragraph1 = SectionIdentifier.asChildOf(headline);
assertTrue(paragraph1.isChildOf(headline));
assertTrue(headline.isParentOf(paragraph1));
assertFalse(paragraph1.isParentOf(headline));
assertFalse(headline.isParentOf(headline1));
assertTrue(headline.isParentOf(headline2));
assertTrue(headline.isParentOf(headline3));
assertTrue(headline.isParentOf(headline4));
assertTrue(headline.isParentOf(headline5));
assertTrue(headline1.isParentOf(headline2));
assertFalse(headline1.isParentOf(headline1));
assertTrue(headline3.isParentOf(headline4));
assertFalse(headline4.isParentOf(headline5));
assertFalse(headline2.isParentOf(headline3));
assertFalse(headline2.isParentOf(headline4));
assertTrue(headline1.isParentOf(headline3));
assertTrue(headline1.isParentOf(headline4));
assertFalse(headline1.isParentOf(headline6));
assertFalse(headline1.isParentOf(headline7));
assertFalse(headline8.isParentOf(headline1));
assertFalse(headline8.isParentOf(headline2));
assertFalse(headline8.isParentOf(headline3));
assertFalse(headline8.isParentOf(headline4));
assertFalse(headline9.isParentOf(headline9));
assertTrue(headline10.isChildOf(headline11));
assertTrue(headline10.isChildOf(headline12));
}
}

@ -4,7 +4,7 @@ plugins {
}
description = "redaction-service-api-v1"
val persistenceServiceVersion = "2.631.0"
val persistenceServiceVersion = "2.465.60"
dependencies {
implementation("org.springframework:spring-web:6.0.12")

@ -2,18 +2,12 @@ package com.iqser.red.service.redaction.v1.model;
public class QueueNames {
public static final String REDACTION_REQUEST_QUEUE_PREFIX = "redaction_request";
public static final String REDACTION_REQUEST_EXCHANGE = "redaction_request_exchange";
public static final String REDACTION_PRIORITY_REQUEST_QUEUE_PREFIX = "redaction_priority_request";
public static final String REDACTION_PRIORITY_REQUEST_EXCHANGE = "redaction_priority_request_exchange";
public static final String REDACTION_RESPONSE_EXCHANGE = "redaction_response_exchange";
public static final String REDACTION_DLQ = "redaction_error";
public static final String REDACTION_QUEUE = "redactionQueue";
public static final String REDACTION_PRIORITY_QUEUE = "redactionPriorityQueue";
public static final String REDACTION_ANALYSIS_RESPONSE_QUEUE = "redactionAnalysisResponseQueue";
public static final String REDACTION_DQL = "redactionDQL";
public static final String SEARCH_TERM_OCCURRENCES_RESPONSE_EXCHANGE = "search_bulk_local_term_response_exchange";
public static final String SEARCH_BULK_LOCAL_TERM_DLQ = "search_bulk_local_term_error";
public static final String MIGRATION_REQUEST_QUEUE = "migrationQueue";
public static final String MIGRATION_QUEUE = "migrationQueue";
public static final String MIGRATION_RESPONSE_QUEUE = "migrationResponseQueue";
public static final String MIGRATION_DLQ = "migrationDLQ";

@ -12,16 +12,14 @@ plugins {
description = "redaction-service-server-v1"
val layoutParserVersion = "0.193.0"
val layoutParserVersion = "0.142.6"
val jacksonVersion = "2.15.2"
val droolsVersion = "9.44.0.Final"
val pdfBoxVersion = "3.0.0"
val persistenceServiceVersion = "2.641.0"
val llmServiceVersion = "1.20.0-RED10072.2"
val persistenceServiceVersion = "2.465.60"
val springBootStarterVersion = "3.1.5"
val springCloudVersion = "4.0.4"
val testContainersVersion = "1.19.7"
val tomcatVersion = "10.1.18"
configurations {
all {
@ -34,31 +32,22 @@ configurations {
dependencies {
implementation(project(":redaction-service-api-v1")) { exclude(group = "com.iqser.red.service", module = "persistence-service-internal-api-v1") }
implementation(project(":document"))
implementation("com.iqser.red.service:persistence-service-internal-api-v1:${persistenceServiceVersion}") { exclude(group = "org.springframework.boot") }
implementation("com.iqser.red.service:persistence-service-shared-mongo-v1:${persistenceServiceVersion}")
{
exclude(group = "com.knecon.fforesight", module = "tenant-commons")
}
implementation("com.knecon.fforesight:layoutparser-service-internal-api:${layoutParserVersion}")
implementation("com.knecon.fforesight:llm-service-api:${llmServiceVersion}")
implementation("com.iqser.red.commons:spring-commons:6.2.0")
implementation("com.iqser.red.commons:metric-commons:2.3.0")
implementation("com.iqser.red.commons:dictionary-merge-commons:1.5.0")
implementation("com.iqser.red.commons:storage-commons:2.50.0")
implementation("com.knecon.fforesight:tenant-commons:0.31.0")
implementation("com.knecon.fforesight:keycloak-commons:0.30.0") {
exclude(group = "com.knecon.fforesight", module = "tenant-commons")
}
implementation("com.iqser.red.commons:storage-commons:2.45.0")
implementation("com.knecon.fforesight:tenant-commons:0.24.0")
implementation("com.knecon.fforesight:tracing-commons:0.5.0")
implementation("com.knecon.fforesight:lifecycle-commons:0.7.0")
implementation("com.knecon.fforesight:lifecycle-commons:0.6.0")
implementation("com.fasterxml.jackson.module:jackson-module-afterburner:${jacksonVersion}")
implementation("com.fasterxml.jackson.datatype:jackson-datatype-jsr310:${jacksonVersion}")
implementation("org.ahocorasick:ahocorasick:0.9.0")
implementation("com.hankcs:aho-corasick-double-array-trie:1.2.2")
implementation("com.github.roklenarcic:aho-corasick:1.2")
implementation("org.ahocorasick:ahocorasick:0.6.3")
implementation("org.javassist:javassist:3.29.2-GA")
implementation("org.drools:drools-engine:${droolsVersion}")
@ -72,16 +61,8 @@ dependencies {
implementation("org.springframework.boot:spring-boot-starter-cache:${springBootStarterVersion}")
implementation("org.springframework.boot:spring-boot-starter-data-redis:${springBootStarterVersion}")
implementation("org.springframework.boot:spring-boot-starter-websocket:${springBootStarterVersion}")
implementation("org.springframework.security:spring-security-messaging:6.1.3")
implementation("org.apache.tomcat:tomcat-websocket:${tomcatVersion}")
implementation("org.apache.tomcat.embed:tomcat-embed-core:${tomcatVersion}")
implementation("org.liquibase:liquibase-core:4.29.2") // Must be set explicitly, otherwise Spring dependency management sets it to 4.20.0
implementation("org.liquibase.ext:liquibase-mongodb:4.29.2")
implementation("net.logstash.logback:logstash-logback-encoder:7.4")
api("ch.qos.logback:logback-classic")
implementation("ch.qos.logback:logback-classic")
implementation("org.reflections:reflections:0.10.2")
@ -102,12 +83,7 @@ dependencies {
group = "com.iqser.red.service",
module = "persistence-service-shared-api-v1"
)
exclude(
group = "com.knecon.fforesight",
module = "document"
)
}
testImplementation("com.pdftron:PDFNet:10.11.0")
}
dependencyManagement {
@ -132,7 +108,6 @@ tasks.named<BootBuildImage>("bootBuildImage") {
"BPE_APPEND_JAVA_TOOL_OPTIONS",
"-XX:MaxMetaspaceSize=1g -Dfile.encoding=UTF-8 -Dkie.repository.project.cache.size=50 -Dkie.repository.project.versions.cache.size=5"
)
environment.put("BPE_DEFAULT_LANG", "en_US.utf8") // java.text.Normalizer does not care for file.encoding
imageName.set("nexus.knecon.com:5001/red/${project.name}")// must build image with same name always, otherwise the builder will not know which image to use as cache. DO NOT CHANGE!
if (project.hasProperty("buildbootDockerHostNetwork")) {
@ -187,19 +162,15 @@ tasks.register("generateJavaDoc", Javadoc::class) {
dependsOn("compileJava")
dependsOn("delombok")
classpath = project.sourceSets["main"].runtimeClasspath
val documentFiles = fileTree("${project(":document").layout.buildDirectory.get()}/generated/sources/delombok/java/main") {
source = fileTree("${buildDir}/generated/sources/delombok/java/main") {
include(droolsImports)
}
val mainFiles = fileTree("${layout.buildDirectory.get()}/generated/sources/delombok/java/main") {
include(droolsImports)
}
source = documentFiles + mainFiles
setDestinationDir(file(project.findProperty("javadocDestinationDir")?.toString() ?: ""))
destinationDir = file(project.findProperty("javadocDestinationDir")?.toString() ?: "")
options.memberLevel = JavadocMemberLevel.PUBLIC
(options as StandardJavadocDocletOptions).apply {
header = "Redaction Service ${project.version}"
footer = "Redaction Service ${project.version}"
title = "API Documentation for Redaction Service ${project.version}"
}
}

@ -9,7 +9,6 @@ import org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration;
import org.springframework.boot.autoconfigure.liquibase.LiquibaseAutoConfiguration;
import org.springframework.boot.autoconfigure.mongo.MongoAutoConfiguration;
import org.springframework.boot.autoconfigure.security.servlet.SecurityAutoConfiguration;
import org.springframework.boot.autoconfigure.task.TaskExecutionAutoConfiguration;
import org.springframework.boot.context.properties.EnableConfigurationProperties;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cloud.openfeign.EnableFeignClients;
@ -22,7 +21,6 @@ import com.iqser.red.service.dictionarymerge.commons.DictionaryMergeService;
import com.iqser.red.service.persistence.service.v1.api.shared.mongo.SharedMongoAutoConfiguration;
import com.iqser.red.service.redaction.v1.server.client.RulesClient;
import com.iqser.red.storage.commons.StorageAutoConfiguration;
import com.knecon.fforesight.keycloakcommons.DefaultKeyCloakCommonsAutoConfiguration;
import com.knecon.fforesight.lifecyclecommons.LifecycleAutoconfiguration;
import com.knecon.fforesight.mongo.database.commons.MongoDatabaseCommonsAutoConfiguration;
import com.knecon.fforesight.mongo.database.commons.liquibase.EnableMongoLiquibase;
@ -36,7 +34,7 @@ import lombok.extern.slf4j.Slf4j;
@Slf4j
@EnableCaching
@ImportAutoConfiguration({MultiTenancyAutoConfiguration.class, SharedMongoAutoConfiguration.class, DefaultKeyCloakCommonsAutoConfiguration.class, LifecycleAutoconfiguration.class})
@ImportAutoConfiguration({MultiTenancyAutoConfiguration.class, SharedMongoAutoConfiguration.class, LifecycleAutoconfiguration.class})
@Import({MetricsConfiguration.class, StorageAutoConfiguration.class, MongoDatabaseCommonsAutoConfiguration.class})
@EnableFeignClients(basePackageClasses = RulesClient.class)
@EnableConfigurationProperties(RedactionServiceSettings.class)

@ -10,13 +10,11 @@ import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import org.reflections.Reflections;
import org.reflections.scanners.Scanners;
import org.reflections.util.ConfigurationBuilder;
import org.reflections.util.FilterBuilder;
import com.iqser.red.service.redaction.v1.server.model.dictionary.SearchImplementation;
@ -27,8 +25,6 @@ import lombok.extern.slf4j.Slf4j;
public class DeprecatedElementsFinder {
public static final String PACKAGE_NAME = "com.iqser.red.service.redaction.v1.server";
public static final Pattern DATA_PACKAGE = Pattern.compile(".*/data/.*");
private Set<Method> deprecatedMethods;
@Getter
private Map<String, String> deprecatedMethodsSignaturesMap;
@ -47,10 +43,7 @@ public class DeprecatedElementsFinder {
Reflections reflections = new Reflections(new ConfigurationBuilder().forPackage(PACKAGE_NAME)
.setExpandSuperTypes(true)
.setScanners(Scanners.MethodsAnnotated, Scanners.TypesAnnotated, Scanners.SubTypes)
.filterInputsBy(new FilterBuilder().includePackage(PACKAGE_NAME).excludePackage(PACKAGE_NAME + ".data")
// Exclude the generated proto data package
));
.setScanners(Scanners.MethodsAnnotated, Scanners.TypesAnnotated, Scanners.SubTypes));
deprecatedMethods = reflections.get(Scanners.MethodsAnnotated.with(Deprecated.class).as(Method.class));

@ -22,28 +22,18 @@ public class RedactionServiceSettings {
private boolean nerServiceEnabled = true;
private boolean azureNerServiceEnabled;
private boolean llmNerServiceEnabled;
private boolean priorityMode;
private long firstLevelDictionaryCacheMaximumSize = 1000;
private long dictionaryCacheMaximumSize = 100;
private int dictionaryCacheExpireAfterAccessDays = 3;
private int droolsExecutionTimeoutSecs = 600;
private int droolsExecutionTimeoutSecs = 300;
private boolean ruleExecutionSecured = true;
private boolean annotationMode;
private boolean droolsDebug;
private boolean protobufJsonFallback = true;
public int getDroolsExecutionTimeoutSecs(int numberOfPages) {

@ -1,10 +0,0 @@
package com.iqser.red.service.redaction.v1.server.client;
import org.springframework.cloud.openfeign.FeignClient;
import com.iqser.red.service.persistence.service.v1.api.internal.resources.DateFormatsResource;
@FeignClient(name = "DateFormatsResource", url = "${persistence-service.url}")
public interface DateFormatsClient extends DateFormatsResource {
}

@ -15,6 +15,5 @@ public class EntityRecognitionEntity {
private int startOffset;
private int endOffset;
private String type;
private Double confidence;
}

@ -1,42 +0,0 @@
package com.iqser.red.service.redaction.v1.server.config;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.listener.PatternTopic;
import org.springframework.data.redis.listener.RedisMessageListenerContainer;
import org.springframework.data.redis.listener.adapter.MessageListenerAdapter;
import org.springframework.messaging.simp.SimpMessagingTemplate;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.iqser.red.service.redaction.v1.server.service.websocket.RedisPubsubReceiver;
import lombok.RequiredArgsConstructor;
@Configuration
@RequiredArgsConstructor
public class RedisPubsubConfiguration {
private final SimpMessagingTemplate template;
private final ObjectMapper mapper;
private final RedisConnectionFactory connectionFactory;
@Bean
public RedisPubsubReceiver redisPubsubReceiver() {
return new RedisPubsubReceiver(template, mapper);
}
@Bean
public MessageListenerAdapter redisPubsubListenerAdapter() {
return new MessageListenerAdapter(redisPubsubReceiver(), "receiveMessage");
}
@Bean
public RedisMessageListenerContainer redisPubsubContainer() {
RedisMessageListenerContainer container = new RedisMessageListenerContainer();
container.setConnectionFactory(connectionFactory);
container.addMessageListener(redisPubsubListenerAdapter(), new PatternTopic("redaction-service-websocket-messages"));
return container;
}
}

@ -1,95 +0,0 @@
package com.iqser.red.service.redaction.v1.server.config;
import java.util.Collections;
import java.util.Optional;
import org.apache.tomcat.websocket.server.WsSci;
import org.springframework.boot.web.embedded.tomcat.TomcatContextCustomizer;
import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.messaging.Message;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.simp.config.ChannelRegistration;
import org.springframework.messaging.simp.config.MessageBrokerRegistry;
import org.springframework.messaging.simp.stomp.StompCommand;
import org.springframework.messaging.simp.stomp.StompHeaderAccessor;
import org.springframework.messaging.support.ChannelInterceptor;
import org.springframework.messaging.support.MessageHeaderAccessor;
import org.springframework.security.authentication.AuthenticationManager;
import org.springframework.security.oauth2.server.resource.authentication.BearerTokenAuthenticationToken;
import org.springframework.security.oauth2.server.resource.authentication.JwtAuthenticationToken;
import org.springframework.web.socket.config.annotation.EnableWebSocketMessageBroker;
import org.springframework.web.socket.config.annotation.StompEndpointRegistry;
import org.springframework.web.socket.config.annotation.WebSocketMessageBrokerConfigurer;
import com.knecon.fforesight.keycloakcommons.security.TenantAuthenticationManagerResolver;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
@Slf4j
@Configuration
@EnableWebSocketMessageBroker
@RequiredArgsConstructor
public class WebSocketConfiguration implements WebSocketMessageBrokerConfigurer {
private final TenantAuthenticationManagerResolver tenantAuthenticationManagerResolver;
@Override
public void configureMessageBroker(MessageBrokerRegistry config) {
config.enableSimpleBroker("/topic");
config.setApplicationDestinationPrefixes("/app");
}
@Override
public void registerStompEndpoints(StompEndpointRegistry registry) {
registry.addEndpoint("/api/rules-logging/rulesocket").setAllowedOrigins("*");
}
@Override
public void configureClientInboundChannel(ChannelRegistration registration) {
// https://docs.spring.io/spring-framework/reference/web/websocket/stomp/authentication-token-based.html
registration.interceptors(new ChannelInterceptor() {
@Override
public Message<?> preSend(Message<?> message, MessageChannel channel) {
StompHeaderAccessor accessor = MessageHeaderAccessor.getAccessor(message, StompHeaderAccessor.class);
if (StompCommand.CONNECT.equals(accessor.getCommand())) {
Optional.ofNullable(accessor.getNativeHeader("Authorization"))
.ifPresent(ah -> {
String bearerToken = ah.get(0).replace("Bearer ", "");
log.info("Received bearer token {}", bearerToken);
AuthenticationManager authenticationManager = tenantAuthenticationManagerResolver.resolve(bearerToken);
JwtAuthenticationToken token = (JwtAuthenticationToken) authenticationManager.authenticate(new BearerTokenAuthenticationToken(bearerToken));
accessor.setUser(token);
});
}
return message;
}
});
}
@Bean
public TomcatServletWebServerFactory tomcatContainerFactory() {
TomcatServletWebServerFactory factory = new TomcatServletWebServerFactory();
factory.setTomcatContextCustomizers(Collections.singletonList(tomcatContextCustomizer()));
return factory;
}
@Bean
public TomcatContextCustomizer tomcatContextCustomizer() {
return context -> context.addServletContainerInitializer(new WsSci(), null);
}
}

@ -1,88 +0,0 @@
package com.iqser.red.service.redaction.v1.server.config;
import java.util.Optional;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.messaging.Message;
import org.springframework.messaging.simp.SimpMessageType;
import org.springframework.messaging.simp.stomp.StompHeaderAccessor;
import org.springframework.security.config.annotation.web.messaging.MessageSecurityMetadataSourceRegistry;
import org.springframework.security.config.annotation.web.socket.AbstractSecurityWebSocketMessageBrokerConfigurer;
import org.springframework.security.oauth2.server.resource.authentication.JwtAuthenticationToken;
import com.knecon.fforesight.keycloakcommons.security.TokenUtils;
import lombok.extern.slf4j.Slf4j;
@Slf4j
@Configuration
public class WebSocketSecurityConfiguration extends AbstractSecurityWebSocketMessageBrokerConfigurer {
@Value("${cors.enabled:false}")
private boolean corsEnabled;
@Override
protected void configureInbound(MessageSecurityMetadataSourceRegistry messages) {
messages.simpTypeMatchers(SimpMessageType.HEARTBEAT, SimpMessageType.UNSUBSCRIBE, SimpMessageType.DISCONNECT)
.permitAll()
.simpTypeMatchers(SimpMessageType.CONNECT)
.anonymous() // this is intended, see WebSocketConfiguration.configureClientInboundChannel
.simpTypeMatchers(SimpMessageType.SUBSCRIBE)
.access("@tenantWebSocketSecurityMatcher.checkCanSubscribeTo(authentication,message)")
.anyMessage()
.denyAll();
}
@Override
protected boolean sameOriginDisabled() {
return corsEnabled;
}
@Bean
public TenantWebSocketSecurityMatcher tenantWebSocketSecurityMatcher() {
return new TenantWebSocketSecurityMatcher();
}
public class TenantWebSocketSecurityMatcher {
public boolean checkCanSubscribeTo(JwtAuthenticationToken authentication, Message<?> message) {
var targetedTenant = extractTenantId(message);
var currentTenant = getCurrentTenant(authentication);
return targetedTenant.isPresent() && currentTenant.isPresent() && currentTenant.get().equals(targetedTenant.get());
}
private Optional<String> getCurrentTenant(JwtAuthenticationToken authentication) {
if (authentication != null && authentication.getToken() != null && authentication.getToken().getTokenValue() != null) {
return Optional.of(TokenUtils.toTenant(authentication.getToken().getTokenValue()));
} else {
return Optional.empty();
}
}
}
private Optional<String> extractTenantId(Message<?> message) {
StompHeaderAccessor sha = StompHeaderAccessor.wrap(message);
String topic = sha.getDestination();
if (topic == null) {
return Optional.empty();
}
String[] segments = topic.split("/");
if (segments.length < 3) {
return Optional.empty();
}
return Optional.of(segments[2]);
}
}
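The subscribe check above keys entirely off the third segment of the STOMP destination. A minimal standalone sketch of that extraction (plain Java, no Spring; the `/topic/{tenantId}/...` destination layout and the defensive length check are assumptions for illustration):

```java
import java.util.Optional;

public class TenantExtractionSketch {

    // Mirrors the idea of extractTenantId(): the tenant id is taken to be
    // the third path segment, e.g. "/topic/tenant-a/files" -> "tenant-a".
    static Optional<String> extractTenantId(String destination) {
        if (destination == null) {
            return Optional.empty();
        }
        String[] segments = destination.split("/");
        if (segments.length < 3) {
            return Optional.empty(); // too few segments, no tenant to extract
        }
        return Optional.of(segments[2]);
    }

    public static void main(String[] args) {
        System.out.println(extractTenantId("/topic/tenant-a/files")); // Optional[tenant-a]
        System.out.println(extractTenantId("/topic"));                // Optional.empty
    }
}
```

Note that `String.split("/")` produces a leading empty segment for destinations starting with `/`, which is why index 2 (not 1) holds the tenant.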


@@ -1,36 +0,0 @@
package com.iqser.red.service.redaction.v1.server.logger;
import lombok.AllArgsConstructor;
import lombok.Getter;
import lombok.NoArgsConstructor;
import lombok.Setter;
import lombok.ToString;
@AllArgsConstructor
@NoArgsConstructor
@Getter
@ToString
public final class Context {
private String fileId;
private String dossierId;
private String dossierTemplateId;
@Setter
private long ruleVersion;
@Setter
private long dateFormatsVersion;
private int analysisNumber;
private String tenantId;
public Context(String fileId, String dossierId, String dossierTemplateId, long ruleVersion, int analysisNumber, String tenantId) {
this.fileId = fileId;
this.dossierId = dossierId;
this.dossierTemplateId = dossierTemplateId;
this.ruleVersion = ruleVersion;
this.analysisNumber = analysisNumber;
this.tenantId = tenantId;
}
}


@@ -1,70 +0,0 @@
package com.iqser.red.service.redaction.v1.server.logger;
import org.kie.api.definition.rule.Rule;
import org.kie.api.event.rule.DefaultRuleRuntimeEventListener;
import org.kie.api.event.rule.ObjectDeletedEvent;
import org.kie.api.event.rule.ObjectInsertedEvent;
import org.kie.api.event.rule.ObjectUpdatedEvent;
import lombok.AllArgsConstructor;
@AllArgsConstructor
public class ObjectTrackingEventListener extends DefaultRuleRuntimeEventListener {
RulesLogger logger;
@Override
public void objectInserted(ObjectInsertedEvent event) {
if (!logger.isObjectTrackingActive()) {
return;
}
if (event.getRule() == null) {
logger.logObjectTracking("ObjectInsertedEvent: {} has been inserted", event.getObject());
return;
}
logger.logObjectTracking("ObjectInsertedEvent: {}: {} has been inserted", formatRuleName(event.getRule()), event.getObject());
}
@Override
public void objectDeleted(ObjectDeletedEvent event) {
if (!logger.isObjectTrackingActive()) {
return;
}
if (event.getRule() == null) {
logger.logObjectTracking("ObjectDeletedEvent: {} has been deleted", event.getOldObject());
return;
}
logger.logObjectTracking("ObjectDeletedEvent: {}: {} has been deleted", formatRuleName(event.getRule()), event.getOldObject());
}
@Override
public void objectUpdated(ObjectUpdatedEvent event) {
if (!logger.isObjectTrackingActive()) {
return;
}
if (event.getRule() == null) {
logger.logObjectTracking("ObjectUpdatedEvent: {} has been updated", event.getObject());
return;
}
logger.logObjectTracking("ObjectUpdatedEvent: {}: {} has been updated", formatRuleName(event.getRule()), event.getObject());
}
public static String formatRuleName(Rule rule) {
String name = rule.getName();
if (name.length() > 20) {
return name.substring(0, 20) + "...";
}
return name;
}
}
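`formatRuleName` caps rule names at 20 characters before logging. The truncation behaviour in isolation (a standalone sketch of the same logic, decoupled from the Drools `Rule` type):

```java
public class RuleNameFormatSketch {

    // Same truncation as ObjectTrackingEventListener.formatRuleName:
    // names longer than 20 characters are cut and suffixed with "...".
    static String formatRuleName(String name) {
        if (name.length() > 20) {
            return name.substring(0, 20) + "...";
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(formatRuleName("short rule"));
        System.out.println(formatRuleName("a very long drools rule name indeed"));
    }
}
```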


@@ -1,45 +0,0 @@
package com.iqser.red.service.redaction.v1.server.logger;
import java.time.OffsetDateTime;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.Getter;
import lombok.NoArgsConstructor;
import lombok.ToString;
@NoArgsConstructor
@AllArgsConstructor
@Data
@Builder
@ToString
public class RuleLogEvent {
private String tenantId;
private String fileId;
private String dossierId;
private String dossierTemplateId;
private long ruleVersion;
private int analysisNumber;
private OffsetDateTime timeStamp;
private LogLevel logLevel;
private String message;
}
@Getter
enum LogLevel {
INFO("INFO"),
WARN("WARN"),
ERROR("ERROR");
private final String name;
LogLevel(String name) {
this.name = name;
}
}


@@ -1,174 +0,0 @@
package com.iqser.red.service.redaction.v1.server.logger;
import java.time.OffsetDateTime;
import org.slf4j.helpers.MessageFormatter;
import com.iqser.red.service.redaction.v1.server.service.websocket.WebSocketService;
import lombok.Getter;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
/**
* This class provides logging functionality specifically for rules execution
* in a Drools context. It is designed to log messages with different log levels
* (INFO, WARN, ERROR) and formats messages using a placeholder-based approach
* similar to popular logging frameworks like SLF4J.
* <p>
* Log messages can include placeholders (i.e., `{}`), which will be replaced by
* the corresponding arguments when the message is formatted.
* <p>
* Example usage:
* <pre>
* logger.info("Message with placeholder {}", object);
* </pre>
*/
@Slf4j
@RequiredArgsConstructor
public class RulesLogger {
private final WebSocketService webSocketService;
private final Context context;
@Getter
private boolean objectTrackingActive;
@Getter
private boolean agendaTrackingActive;
/**
* Logs a message at the INFO level.
*
* @param message The log message containing optional placeholders (i.e., `{}`).
* @param args The arguments to replace the placeholders in the message.
*/
public void info(String message, Object... args) {
log(LogLevel.INFO, message, args);
}
/**
* Logs a message at the WARN level.
*
* @param message The log message containing optional placeholders (i.e., `{}`).
* @param args The arguments to replace the placeholders in the message.
*/
public void warn(String message, Object... args) {
log(LogLevel.WARN, message, args);
}
/**
* Logs a message at the INFO level, if object tracking has been activated.
*
* @param message The log message containing optional placeholders (i.e., `{}`).
* @param args The arguments to replace the placeholders in the message.
*/
public void logObjectTracking(String message, Object... args) {
if (objectTrackingActive) {
info(message, args);
}
}
/**
* If object tracking is enabled, the RulesLogger will log all inserted/retracted/updated events.
* Initial value is disabled.
*/
public void enableObjectTracking() {
objectTrackingActive = true;
}
/**
* If object tracking is disabled, the RulesLogger won't log any inserted/retracted/updated events.
* Initial value is disabled.
*/
public void disableObjectTracking() {
objectTrackingActive = false;
}
/**
* Logs a message at the INFO level, if agenda tracking has been activated.
*
* @param message The log message containing optional placeholders (i.e., `{}`).
* @param args The arguments to replace the placeholders in the message.
*/
public void logAgendaTracking(String message, Object... args) {
if (agendaTrackingActive) {
info(message, args);
}
}
/**
* If agenda tracking is enabled, the RulesLogger will log each firing Rule with its name, objects and metadata.
* Initial value is disabled.
*/
public void enableAgendaTracking() {
agendaTrackingActive = true;
}
/**
* If agenda tracking is disabled, the RulesLogger won't log any rule firings.
* Initial value is disabled.
*/
public void disableAgendaTracking() {
agendaTrackingActive = false;
}
/**
* Logs a message at the ERROR level, including an exception.
*
* @param throwable The exception to log.
* @param message The log message containing optional placeholders (i.e., `{}`).
* @param args The arguments to replace the placeholders in the message.
*/
public void error(Throwable throwable, String message, Object... args) {
log(LogLevel.ERROR, message + " Exception: " + throwable.toString(), args);
}
private void log(LogLevel logLevel, String message, Object... args) {
var formattedMessage = formatMessage(message, args);
switch (logLevel) {
case INFO -> log.info(message, args);
case WARN -> log.warn(message, args);
case ERROR -> log.error(message, args);
}
var ruleLog = RuleLogEvent.builder()
.tenantId(context.getTenantId())
.ruleVersion(context.getRuleVersion())
.fileId(context.getFileId())
.analysisNumber(context.getAnalysisNumber())
.dossierId(context.getDossierId())
.dossierTemplateId(context.getDossierTemplateId())
.message(formattedMessage)
.logLevel(logLevel)
.timeStamp(OffsetDateTime.now())
.build();
webSocketService.sendLogEvent(ruleLog);
}
private String formatMessage(String message, Object... args) {
return MessageFormatter.arrayFormat(message, args).getMessage();
}
}
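RulesLogger delegates placeholder substitution to SLF4J's `MessageFormatter.arrayFormat(...).getMessage()`. A dependency-free sketch of the same `{}` convention (simplified: unlike the real MessageFormatter, this omits escape handling for literal `{}`):

```java
public class PlaceholderFormatSketch {

    // Replaces each "{}" in order with the next argument's string form,
    // approximating MessageFormatter.arrayFormat(...).getMessage().
    static String format(String message, Object... args) {
        StringBuilder out = new StringBuilder();
        int argIndex = 0;
        int from = 0;
        int at;
        while ((at = message.indexOf("{}", from)) >= 0 && argIndex < args.length) {
            out.append(message, from, at);  // copy text before the placeholder
            out.append(args[argIndex++]);   // substitute the next argument
            from = at + 2;                  // skip past "{}"
        }
        out.append(message.substring(from));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(format("rule {} fired {} times", "CBI.7", 3)); // rule CBI.7 fired 3 times
    }
}
```

Surplus placeholders are left untouched, which matches the forgiving behaviour rule authors expect from SLF4J-style formatting.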


@@ -1,65 +0,0 @@
package com.iqser.red.service.redaction.v1.server.logger;
import java.util.Map;
import org.kie.api.definition.rule.Rule;
import org.kie.api.event.rule.AfterMatchFiredEvent;
import org.kie.api.event.rule.DefaultAgendaEventListener;
import org.kie.api.event.rule.MatchCreatedEvent;
import lombok.AllArgsConstructor;
@AllArgsConstructor
public class TrackingAgendaEventListener extends DefaultAgendaEventListener {
private RulesLogger logger;
@Override
public void matchCreated(MatchCreatedEvent event) {
if (logger.isAgendaTrackingActive()) {
logger.logAgendaTracking(event.toString());
}
}
@Override
public void afterMatchFired(AfterMatchFiredEvent event) {
if (!logger.isAgendaTrackingActive()) {
return;
}
Rule rule = event.getMatch().getRule();
String ruleName = formatRuleName(rule);
Map<String, Object> ruleMetaDataMap = rule.getMetaData();
StringBuilder sb = new StringBuilder("AfterMatchFiredEvent: " + ruleName);
if (event.getMatch().getObjects() != null && !event.getMatch().getObjects().isEmpty()) {
sb.append(", ").append(event.getMatch().getObjects().size()).append(" objects: ");
for (Object object : event.getMatch().getObjects()) {
sb.append(object).append(", ");
}
sb.delete(sb.length() - 2, sb.length());
}
if (!ruleMetaDataMap.isEmpty()) {
sb.append("\n With [").append(ruleMetaDataMap.size()).append("] meta-data:");
for (String key : ruleMetaDataMap.keySet()) {
sb.append("\n key=").append(key).append(", value=").append(ruleMetaDataMap.get(key));
}
}
logger.logAgendaTracking(sb.toString());
}
public static String formatRuleName(Rule rule) {
return ObjectTrackingEventListener.formatRuleName(rule);
}
}


@@ -0,0 +1,438 @@
package com.iqser.red.service.redaction.v1.server.migration;
import java.time.OffsetDateTime;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import org.springframework.stereotype.Service;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.AnnotationStatus;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.ManualRedactions;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.IdRemoval;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualForceRedaction;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualLegalBasisChange;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRecategorization;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRedactionEntry;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualResizeRedaction;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.Change;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ChangeType;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.Engine;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ManualChange;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ManualRedactionType;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.Point;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.Rectangle;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.RedactionLog;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.RedactionLogEntry;
import com.iqser.red.service.redaction.v1.server.service.DictionaryService;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
@Slf4j
@Service
@RequiredArgsConstructor
public class LegacyRedactionLogMergeService {
private final DictionaryService dictionaryService;
public RedactionLog addManualAddEntriesAndRemoveSkippedImported(RedactionLog redactionLog, ManualRedactions manualRedactions, String dossierTemplateId) {
Set<String> skippedImportedRedactions = new HashSet<>();
log.info("Adding manual add entries and removing skipped or imported entries");
if (manualRedactions != null) {
var manualRedactionLogEntries = addManualAddEntries(manualRedactions.getEntriesToAdd(), redactionLog.getAnalysisNumber());
redactionLog.getRedactionLogEntry().addAll(manualRedactionLogEntries);
var manualRedactionWrappers = createManualRedactionWrappers(manualRedactions);
for (RedactionLogEntry entry : redactionLog.getRedactionLogEntry()) {
if (entry.isImported()) {
processRedactionLogEntry(manualRedactionWrappers.stream()
.filter(ManualRedactionWrapper::isApproved)
.filter(mr -> entry.getId().equals(mr.getId()))
.collect(Collectors.toList()), entry, dossierTemplateId);
if (!entry.isRedacted()) {
skippedImportedRedactions.add(entry.getId());
}
}
}
}
Set<String> processedIds = new HashSet<>();
redactionLog.getRedactionLogEntry().removeIf(entry -> {
if (entry.isFalsePositive()) {
return true;
}
if (entry.getImportedRedactionIntersections() != null) {
entry.getImportedRedactionIntersections().removeAll(skippedImportedRedactions);
if (!entry.getImportedRedactionIntersections().isEmpty() && (!entry.isImage() || !(entry.getType().equals("image") || entry.getType().equals("ocr")))) {
return true;
}
}
if (processedIds.contains(entry.getId())) {
log.info("Duplicate annotation found with id {}", entry.getId());
return true;
}
processedIds.add(entry.getId());
return false;
});
return redactionLog;
}
public long getNumberOfAffectedAnnotations(ManualRedactions manualRedactions) {
return createManualRedactionWrappers(manualRedactions).stream()
.map(ManualRedactionWrapper::getId)
.distinct()
.count();
}
private List<ManualRedactionWrapper> createManualRedactionWrappers(ManualRedactions manualRedactions) {
List<ManualRedactionWrapper> manualRedactionWrappers = new ArrayList<>();
manualRedactions.getRecategorizations()
.forEach(item -> {
if (item.getSoftDeletedTime() == null) {
manualRedactionWrappers.add(new ManualRedactionWrapper(item.getAnnotationId(), item.getRequestDate(), item, item.isApproved()));
}
});
manualRedactions.getIdsToRemove()
.forEach(item -> {
if (item.getSoftDeletedTime() == null) {
manualRedactionWrappers.add(new ManualRedactionWrapper(item.getAnnotationId(), item.getRequestDate(), item, item.isApproved()));
}
});
manualRedactions.getForceRedactions()
.forEach(item -> {
if (item.getSoftDeletedTime() == null) {
manualRedactionWrappers.add(new ManualRedactionWrapper(item.getAnnotationId(), item.getRequestDate(), item, item.isApproved()));
}
});
manualRedactions.getLegalBasisChanges()
.forEach(item -> {
if (item.getSoftDeletedTime() == null) {
manualRedactionWrappers.add(new ManualRedactionWrapper(item.getAnnotationId(), item.getRequestDate(), item, item.isApproved()));
}
});
manualRedactions.getResizeRedactions()
.forEach(item -> {
if (item.getSoftDeletedTime() == null) {
manualRedactionWrappers.add(new ManualRedactionWrapper(item.getAnnotationId(), item.getRequestDate(), item, item.isApproved()));
}
});
Collections.sort(manualRedactionWrappers);
return manualRedactionWrappers;
}
private void processRedactionLogEntry(List<ManualRedactionWrapper> manualRedactionWrappers, RedactionLogEntry redactionLogEntry, String dossierTemplateId) {
manualRedactionWrappers.forEach(mrw -> {
Object item = mrw.getItem();
if (item instanceof ManualRecategorization imageRecategorization) {
processManualImageRecategorization(redactionLogEntry, dossierTemplateId, imageRecategorization);
}
if (item instanceof IdRemoval manualRemoval) {
processIdRemoval(redactionLogEntry, manualRemoval);
}
if (item instanceof ManualForceRedaction manualForceRedact) {
processManualForceRedaction(redactionLogEntry, dossierTemplateId, manualForceRedact);
}
if (item instanceof ManualLegalBasisChange manualLegalBasisChange) {
processManualLegalBasisChange(redactionLogEntry, manualLegalBasisChange);
}
if (item instanceof ManualResizeRedaction manualResizeRedact) {
processManualResizeRedaction(redactionLogEntry, manualResizeRedact);
}
});
}
private void processManualImageRecategorization(RedactionLogEntry redactionLogEntry, String dossierTemplateId, ManualRecategorization imageRecategorization) {
String manualOverrideReason = null;
if (imageRecategorization.getStatus().equals(AnnotationStatus.APPROVED)) {
redactionLogEntry.setType(imageRecategorization.getType());
redactionLogEntry.setSection("Image:" + redactionLogEntry.getType());
if (dictionaryService.isHint(imageRecategorization.getType(), dossierTemplateId)) {
redactionLogEntry.setRedacted(false);
redactionLogEntry.setHint(true);
} else {
redactionLogEntry.setHint(false);
}
manualOverrideReason = mergeReasonIfNecessary(redactionLogEntry.getReason(), ", recategorized by manual override");
} else if (imageRecategorization.getStatus().equals(AnnotationStatus.REQUESTED)) {
manualOverrideReason = mergeReasonIfNecessary(redactionLogEntry.getReason(), ", requested to recategorize");
}
if (manualOverrideReason != null) {
redactionLogEntry.setReason(manualOverrideReason);
}
redactionLogEntry.getManualChanges()
.add(ManualChange.from(imageRecategorization)
.withManualRedactionType(ManualRedactionType.RECATEGORIZE)
.withChange("type", imageRecategorization.getType())
.withChange("section", imageRecategorization.getSection())
.withChange("legalBasis", imageRecategorization.getLegalBasis())
.withChange("value", imageRecategorization.getValue()));
}
private String mergeReasonIfNecessary(String currentReason, String addition) {
if (currentReason != null) {
if (!currentReason.contains(addition)) {
return currentReason + addition;
}
return currentReason;
} else {
return "";
}
}
private void processIdRemoval(RedactionLogEntry redactionLogEntry, IdRemoval manualRemoval) {
boolean isApprovedRedaction = manualRemoval.getStatus().equals(AnnotationStatus.APPROVED);
if (isApprovedRedaction && manualRemoval.isRemoveFromDictionary() && isBasedOnDictionaryOnly(redactionLogEntry)) {
log.debug("Skipping merge for dictionary-modifying entry");
} else {
String manualOverrideReason = null;
if (isApprovedRedaction) {
redactionLogEntry.setRedacted(false);
manualOverrideReason = mergeReasonIfNecessary(redactionLogEntry.getReason(), ", removed by manual override");
redactionLogEntry.setHint(false);
} else if (manualRemoval.getStatus().equals(AnnotationStatus.REQUESTED)) {
manualOverrideReason = mergeReasonIfNecessary(redactionLogEntry.getReason(), ", requested to remove");
}
if (manualOverrideReason != null) {
redactionLogEntry.setReason(manualOverrideReason);
}
}
redactionLogEntry.getManualChanges()
.add(ManualChange.from(manualRemoval)
.withManualRedactionType(manualRemoval.isRemoveFromDictionary() ? ManualRedactionType.REMOVE_FROM_DICTIONARY : ManualRedactionType.REMOVE_LOCALLY));
}
private boolean isBasedOnDictionaryOnly(RedactionLogEntry redactionLogEntry) {
return redactionLogEntry.getEngines().contains(Engine.DICTIONARY) && redactionLogEntry.getEngines().size() == 1;
}
private void processManualForceRedaction(RedactionLogEntry redactionLogEntry, String dossierTemplateId, ManualForceRedaction manualForceRedact) {
String manualOverrideReason = null;
var dictionaryIsHint = dictionaryService.isHint(redactionLogEntry.getType(), dossierTemplateId);
if (manualForceRedact.getStatus().equals(AnnotationStatus.APPROVED)) {
// Forcing a skipped hint should result in a hint
if (dictionaryIsHint) {
redactionLogEntry.setHint(true);
} else {
redactionLogEntry.setRedacted(true);
}
manualOverrideReason = mergeReasonIfNecessary(redactionLogEntry.getReason(), ", forced by manual override");
redactionLogEntry.setLegalBasis(manualForceRedact.getLegalBasis());
} else if (manualForceRedact.getStatus().equals(AnnotationStatus.REQUESTED)) {
manualOverrideReason = mergeReasonIfNecessary(redactionLogEntry.getReason(), ", requested to force " + (dictionaryIsHint ? "hint" : "redact"));
redactionLogEntry.setLegalBasis(manualForceRedact.getLegalBasis());
}
if (manualOverrideReason != null) {
redactionLogEntry.setReason(manualOverrideReason);
}
var manualChange = ManualChange.from(manualForceRedact).withManualRedactionType(dictionaryIsHint ? ManualRedactionType.FORCE_HINT : ManualRedactionType.FORCE_REDACT);
redactionLogEntry.getManualChanges().add(manualChange);
}
private void processManualLegalBasisChange(RedactionLogEntry redactionLogEntry, ManualLegalBasisChange manualLegalBasisChange) {
String manualOverrideReason = null;
if (manualLegalBasisChange.getStatus().equals(AnnotationStatus.APPROVED)) {
manualOverrideReason = mergeReasonIfNecessary(redactionLogEntry.getReason(), ", legal basis was manually changed");
redactionLogEntry.setLegalBasis(manualLegalBasisChange.getLegalBasis());
redactionLogEntry.setRedacted(true);
if (manualLegalBasisChange.getSection() != null) {
redactionLogEntry.setSection(manualLegalBasisChange.getSection());
}
if (redactionLogEntry.isRectangle() && manualLegalBasisChange.getValue() != null) {
redactionLogEntry.setValue(manualLegalBasisChange.getValue());
}
} else if (manualLegalBasisChange.getStatus().equals(AnnotationStatus.REQUESTED)) {
manualOverrideReason = mergeReasonIfNecessary(redactionLogEntry.getReason(), ", legal basis change requested");
}
if (manualOverrideReason != null) {
redactionLogEntry.setReason(manualOverrideReason);
}
var manualChange = ManualChange.from(manualLegalBasisChange).withManualRedactionType(ManualRedactionType.LEGAL_BASIS_CHANGE);
manualChange.withChange("legalBasis", manualLegalBasisChange.getLegalBasis());
if (manualLegalBasisChange.getSection() != null) {
manualChange.withChange("section", manualLegalBasisChange.getSection());
}
if (redactionLogEntry.isRectangle() && manualLegalBasisChange.getValue() != null) {
manualChange.withChange("value", manualLegalBasisChange.getValue());
}
redactionLogEntry.getManualChanges().add(manualChange);
}
private void processManualResizeRedaction(RedactionLogEntry redactionLogEntry, ManualResizeRedaction manualResizeRedact) {
String manualOverrideReason = null;
if (manualResizeRedact.getStatus().equals(AnnotationStatus.APPROVED)) {
redactionLogEntry.setPositions(convertPositions(manualResizeRedact.getPositions()));
if (!"signature".equalsIgnoreCase(redactionLogEntry.getType()) && !"logo".equalsIgnoreCase(redactionLogEntry.getType())) {
redactionLogEntry.setValue(manualResizeRedact.getValue());
}
// This is for backwards compatibility, now the text after/before is calculated during reanalysis because we need to find dict entries on positions where entries are resized to smaller.
if (manualResizeRedact.getTextBefore() != null || manualResizeRedact.getTextAfter() != null) {
redactionLogEntry.setTextBefore(manualResizeRedact.getTextBefore());
redactionLogEntry.setTextAfter(manualResizeRedact.getTextAfter());
}
manualOverrideReason = mergeReasonIfNecessary(redactionLogEntry.getReason(), ", resized by manual override");
} else if (manualResizeRedact.getStatus().equals(AnnotationStatus.REQUESTED)) {
manualOverrideReason = mergeReasonIfNecessary(redactionLogEntry.getReason(), ", requested to resize redact");
redactionLogEntry.setPositions(convertPositions(manualResizeRedact.getPositions()));
// This is for backwards compatibility, now the text after/before is calculated during reanalysis because we need to find dict entries on positions where entries are resized to smaller.
if (manualResizeRedact.getTextBefore() != null || manualResizeRedact.getTextAfter() != null) {
redactionLogEntry.setTextBefore(manualResizeRedact.getTextBefore());
redactionLogEntry.setTextAfter(manualResizeRedact.getTextAfter());
}
}
if (manualOverrideReason != null) {
redactionLogEntry.setReason(manualOverrideReason);
}
redactionLogEntry.getManualChanges()
.add(ManualChange.from(manualResizeRedact).withManualRedactionType(ManualRedactionType.RESIZE).withChange("value", manualResizeRedact.getValue()));
}
public List<RedactionLogEntry> addManualAddEntries(Set<ManualRedactionEntry> manualAdds, int analysisNumber) {
List<RedactionLogEntry> redactionLogEntries = new ArrayList<>();
for (ManualRedactionEntry manualRedactionEntry : manualAdds) {
if (shouldCreateManualEntry(manualRedactionEntry)) {
RedactionLogEntry redactionLogEntry = createRedactionLogEntry(manualRedactionEntry, manualRedactionEntry.getAnnotationId(), analysisNumber);
redactionLogEntry.setPositions(convertPositions(manualRedactionEntry.getPositions()));
redactionLogEntry.setTextBefore(manualRedactionEntry.getTextBefore());
redactionLogEntry.setTextAfter(manualRedactionEntry.getTextAfter());
redactionLogEntries.add(redactionLogEntry);
}
}
return redactionLogEntries;
}
private List<Rectangle> convertPositions(List<com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.Rectangle> positions) {
return positions.stream()
.map(pos -> new Rectangle(new Point(pos.getTopLeftX(), pos.getTopLeftY()), pos.getWidth(), pos.getHeight(), pos.getPage()))
.collect(Collectors.toList());
}
@SuppressWarnings("PMD.UselessParentheses")
private boolean shouldCreateManualEntry(ManualRedactionEntry manualRedactionEntry) {
if (!manualRedactionEntry.isApproved()) {
return false;
}
boolean addToAnyDictionary = manualRedactionEntry.isAddToDictionary() || manualRedactionEntry.isAddToDossierDictionary();
return !addToAnyDictionary || manualRedactionEntry.getProcessedDate() == null;
}
private RedactionLogEntry createRedactionLogEntry(ManualRedactionEntry manualRedactionEntry, String id, int analysisNumber) {
var addToDictionary = manualRedactionEntry.isAddToDictionary() || manualRedactionEntry.isAddToDossierDictionary();
var change = ManualChange.from(manualRedactionEntry).withManualRedactionType(addToDictionary ? ManualRedactionType.ADD_TO_DICTIONARY : ManualRedactionType.ADD_LOCALLY);
List<ManualChange> changeList = new ArrayList<>();
changeList.add(change);
return RedactionLogEntry.builder()
.id(id)
.reason(manualRedactionEntry.getReason())
.isDictionaryEntry(manualRedactionEntry.isAddToDictionary())
.isDossierDictionaryEntry(manualRedactionEntry.isAddToDossierDictionary())
.legalBasis(manualRedactionEntry.getLegalBasis())
.value(manualRedactionEntry.getValue())
.sourceId(manualRedactionEntry.getSourceId())
.section(manualRedactionEntry.getSection())
.type(manualRedactionEntry.getType())
.redacted(true)
.isHint(false)
.sectionNumber(-1)
.rectangle(manualRedactionEntry.isRectangle())
.manualChanges(changeList)
.changes(List.of(new Change(analysisNumber + 1, ChangeType.ADDED, manualRedactionEntry.getRequestDate())))
.build();
}
@Data
@AllArgsConstructor
private static class ManualRedactionWrapper implements Comparable<ManualRedactionWrapper> {
private String id;
private OffsetDateTime date;
private Object item;
private boolean approved;
@Override
public int compareTo(ManualRedactionWrapper o) {
return this.date.compareTo(o.date);
}
}
}
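`ManualRedactionWrapper` implements `Comparable` on the request date so that pending manual changes replay in chronological order. A standalone sketch of that ordering with a hypothetical minimal wrapper (the record and sample dates are illustrative, not from the service):

```java
import java.time.OffsetDateTime;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class WrapperOrderingSketch {

    // Minimal stand-in for ManualRedactionWrapper: ordered by request date,
    // so Collections.sort replays changes oldest-first.
    record Wrapper(String id, OffsetDateTime date) implements Comparable<Wrapper> {
        @Override
        public int compareTo(Wrapper o) {
            return this.date.compareTo(o.date);
        }
    }

    public static void main(String[] args) {
        List<Wrapper> wrappers = new ArrayList<>();
        wrappers.add(new Wrapper("later", OffsetDateTime.parse("2024-09-10T10:00:00+02:00")));
        wrappers.add(new Wrapper("earlier", OffsetDateTime.parse("2024-09-01T10:00:00+02:00")));
        Collections.sort(wrappers);
        System.out.println(wrappers.get(0).id()); // earlier
    }
}
```

Applying changes oldest-first means a later recategorization or removal can overwrite the effect of an earlier one, which is the intended last-write-wins merge semantics.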


@@ -0,0 +1,75 @@
package com.iqser.red.service.redaction.v1.server.migration;
import java.lang.reflect.Field;
import java.util.Comparator;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import org.springframework.stereotype.Service;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.RedactionLog;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.RedactionLogEntry;
import lombok.SneakyThrows;
@Service
@SuppressWarnings("PMD")
public class LegacyVersion0MigrationService {
public RedactionLog mergeDuplicateAnnotationIds(RedactionLog redactionLog) {
List<RedactionLogEntry> mergedEntries = new LinkedList<>();
Map<String, List<RedactionLogEntry>> entriesById = redactionLog.getRedactionLogEntry()
.stream()
.collect(Collectors.groupingBy(RedactionLogEntry::getId));
for (List<RedactionLogEntry> entries : entriesById.values()) {
if (entries.isEmpty()) {
continue;
}
if (entries.size() == 1) {
mergedEntries.add(entries.get(0));
continue;
}
List<RedactionLogEntry> sortedEntries = entries.stream()
.sorted(Comparator.comparing(entry -> entry.getChanges()
.get(0).getDateTime()))
.toList();
RedactionLogEntry initialEntry = sortedEntries.get(0);
for (RedactionLogEntry entry : sortedEntries.subList(1, sortedEntries.size())) {
copyNonNullFields(entry, initialEntry);
}
mergedEntries.add(initialEntry);
}
redactionLog.setRedactionLogEntry(mergedEntries);
return redactionLog;
}
@SneakyThrows
public static void copyNonNullFields(RedactionLogEntry source, RedactionLogEntry destination) {
if (source == null || destination == null) {
throw new IllegalArgumentException("Source and destination objects must not be null");
}
Class<?> sourceClass = source.getClass();
Field[] sourceFields = sourceClass.getDeclaredFields();
for (Field field : sourceFields) {
if (java.lang.reflect.Modifier.isStatic(field.getModifiers())) {
continue; // static fields belong to the class, not the instance being merged
}
field.setAccessible(true);
Object value = field.get(source);
if (value != null) {
field.set(destination, value);
}
}
}
}
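The reflective merge in `copyNonNullFields` can be exercised on any pair of same-class objects. A small standalone sketch with a hypothetical two-field POJO standing in for `RedactionLogEntry` (the `Entry` class and its values are illustrative only):

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class NonNullCopySketch {

    // Hypothetical POJO standing in for RedactionLogEntry.
    static class Entry {
        String reason;
        String legalBasis;
    }

    // Same idea as copyNonNullFields: overwrite destination fields only
    // where the source field is non-null; static fields are skipped.
    static void copyNonNullFields(Object source, Object destination) throws IllegalAccessException {
        for (Field field : source.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            field.setAccessible(true);
            Object value = field.get(source);
            if (value != null) {
                field.set(destination, value);
            }
        }
    }

    public static void main(String[] args) throws IllegalAccessException {
        Entry older = new Entry();
        older.reason = "original reason";
        older.legalBasis = "Art. 6";
        Entry newer = new Entry();
        newer.reason = "updated reason"; // legalBasis left null on purpose
        copyNonNullFields(newer, older);
        System.out.println(older.reason + " / " + older.legalBasis); // updated reason / Art. 6
    }
}
```

Because only non-null source fields win, the oldest entry accumulates the newest non-null values while keeping anything the later duplicates never set.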


@@ -0,0 +1,101 @@
package com.iqser.red.service.redaction.v1.server.migration;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ChangeType;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualChange;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.Change;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.Engine;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ManualRedactionType;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.RedactionLogEntry;
public class MigrationMapper {
public static com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Change toEntityLogChanges(Change change) {
return new com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Change(change.getAnalysisNumber(),
toEntityLogType(change.getType()),
change.getDateTime());
}
public static com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualChange toEntityLogManualChanges(com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ManualChange manualChange) {
return new com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualChange(toManualRedactionType(manualChange.getManualRedactionType()),
manualChange.getProcessedDate(),
manualChange.getRequestedDate(),
manualChange.getUserId(),
manualChange.getPropertyChanges());
}
public static ChangeType toEntityLogType(com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ChangeType type) {
return switch (type) {
case ADDED -> ChangeType.ADDED;
case REMOVED -> ChangeType.REMOVED;
case CHANGED -> ChangeType.CHANGED;
};
}
public static com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType toManualRedactionType(ManualRedactionType manualRedactionType) {
return switch (manualRedactionType) {
case ADD_LOCALLY -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.ADD;
case ADD_TO_DICTIONARY -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.ADD_TO_DICTIONARY;
case REMOVE_LOCALLY -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.REMOVE;
case REMOVE_FROM_DICTIONARY -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.REMOVE_FROM_DICTIONARY;
case FORCE_REDACT -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.FORCE;
case FORCE_HINT -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.FORCE;
case RECATEGORIZE -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.RECATEGORIZE;
case LEGAL_BASIS_CHANGE -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.LEGAL_BASIS_CHANGE;
case RESIZE -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.RESIZE;
};
}
public static com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine toEntityLogEngine(Engine engine) {
return switch (engine) {
case DICTIONARY -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine.DICTIONARY;
case NER -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine.NER;
case RULE -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine.RULE;
};
}
public static Set<com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine> getMigratedEngines(RedactionLogEntry entry) {
Set<com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine> engines = new HashSet<>();
if (entry.isImported()) {
engines.add(com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine.IMPORTED);
}
if (entry.getEngines() == null) {
return engines;
}
entry.getEngines()
.stream()
.map(MigrationMapper::toEntityLogEngine)
.forEach(engines::add);
return engines;
}
public List<ManualChange> migrateManualChanges(List<com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ManualChange> manualChanges) {
if (manualChanges == null) {
return Collections.emptyList();
}
return manualChanges.stream()
.map(MigrationMapper::toEntityLogManualChanges)
.toList();
}
}
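The mapper above relies on exhaustive switch expressions over enums: there is no `default` branch, so adding a new enum constant fails compilation until every mapper is updated. A minimal standalone sketch of the same pattern, using hypothetical `OldType`/`NewType` enums rather than the project's classes:

```java
public class EnumMappingSketch {

    enum OldType { ADDED, REMOVED, CHANGED }

    enum NewType { ADDED, REMOVED, CHANGED }

    // Exhaustive switch expression: no default branch, so a new OldType
    // constant causes a compile error here instead of a silent fallthrough.
    static NewType map(OldType type) {
        return switch (type) {
            case ADDED -> NewType.ADDED;
            case REMOVED -> NewType.REMOVED;
            case CHANGED -> NewType.CHANGED;
        };
    }

    public static void main(String[] args) {
        System.out.println(map(OldType.REMOVED)); // prints REMOVED
    }
}
```

This is why `toEntityLogType` and `toManualRedactionType` can omit a default case safely.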


@ -0,0 +1,96 @@
package com.iqser.red.service.redaction.v1.server.migration;
import static com.iqser.red.service.redaction.v1.model.QueueNames.MIGRATION_QUEUE;
import static com.iqser.red.service.redaction.v1.model.QueueNames.MIGRATION_RESPONSE_QUEUE;
import org.springframework.amqp.core.Message;
import org.springframework.amqp.rabbit.annotation.RabbitHandler;
import org.springframework.amqp.rabbit.annotation.RabbitListener;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.stereotype.Service;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.iqser.red.service.persistence.service.v1.api.shared.model.dossiertemplate.dossier.file.FileType;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.RedactionLog;
import com.iqser.red.service.redaction.v1.model.MigrationRequest;
import com.iqser.red.service.redaction.v1.model.MigrationResponse;
import com.iqser.red.service.redaction.v1.server.model.MigratedEntityLog;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Document;
import com.iqser.red.service.redaction.v1.server.service.DictionaryService;
import com.iqser.red.service.redaction.v1.server.service.document.DocumentGraphMapper;
import com.iqser.red.service.redaction.v1.server.storage.RedactionStorageService;
import lombok.AccessLevel;
import lombok.RequiredArgsConstructor;
import lombok.SneakyThrows;
import lombok.experimental.FieldDefaults;
import lombok.extern.slf4j.Slf4j;
@Slf4j
@Service
@RequiredArgsConstructor
@FieldDefaults(makeFinal = true, level = AccessLevel.PRIVATE)
public class MigrationMessageReceiver {
ObjectMapper objectMapper;
RedactionLogToEntityLogMigrationService redactionLogToEntityLogMigrationService;
RedactionStorageService redactionStorageService;
LegacyRedactionLogMergeService legacyRedactionLogMergeService;
LegacyVersion0MigrationService legacyVersion0MigrationService;
RabbitTemplate rabbitTemplate;
DictionaryService dictionaryService;
@SneakyThrows
@RabbitHandler
@RabbitListener(queues = MIGRATION_QUEUE)
public void receiveMigrationRequest(Message message) {
MigrationRequest migrationRequest = objectMapper.readValue(message.getBody(), MigrationRequest.class);
log.info("--------------------------------------------------------------------");
log.info("Starting redactionLog to entityLog migration for dossierId {} and fileId {}", migrationRequest.getDossierId(), migrationRequest.getFileId());
dictionaryService.updateDictionary(migrationRequest.getDossierTemplateId(), migrationRequest.getDossierId());
Document document = DocumentGraphMapper.toDocumentGraph(redactionStorageService.getDocumentData(migrationRequest.getDossierId(), migrationRequest.getFileId()));
RedactionLog redactionLog = redactionStorageService.getRedactionLog(migrationRequest.getDossierId(), migrationRequest.getFileId());
if (redactionLog.getAnalysisVersion() == 0) {
redactionLog = legacyVersion0MigrationService.mergeDuplicateAnnotationIds(redactionLog);
} else {
redactionLog = legacyRedactionLogMergeService.addManualAddEntriesAndRemoveSkippedImported(redactionLog,
migrationRequest.getManualRedactions(),
migrationRequest.getDossierTemplateId());
}
MigratedEntityLog migratedEntityLog = redactionLogToEntityLogMigrationService.migrate(redactionLog,
document,
migrationRequest.getDossierTemplateId(),
migrationRequest.getManualRedactions(),
migrationRequest.getFileId(),
migrationRequest.getEntitiesWithComments(),
migrationRequest.isFileIsApproved());
log.info("Storing migrated entityLog and ids to migrate in DB for file {}", migrationRequest.getFileId());
redactionStorageService.storeObject(migrationRequest.getDossierId(), migrationRequest.getFileId(), FileType.ENTITY_LOG, migratedEntityLog.getEntityLog());
redactionStorageService.storeObject(migrationRequest.getDossierId(), migrationRequest.getFileId(), FileType.MIGRATED_IDS, migratedEntityLog.getMigratedIds());
sendFinished(MigrationResponse.builder().dossierId(migrationRequest.getDossierId()).fileId(migrationRequest.getFileId()).build());
log.info("Migrated {} redactionLog entries, found {} annotation ids for migration in the db, {} new manual entries, for dossierId {} and fileId {}",
migratedEntityLog.getEntityLog().getEntityLogEntry().size(),
migratedEntityLog.getMigratedIds().getMappings().size(),
migratedEntityLog.getMigratedIds().getManualRedactionEntriesToAdd().size(),
migrationRequest.getDossierId(),
migrationRequest.getFileId());
log.info("");
}
@SneakyThrows
public void sendFinished(MigrationResponse migrationResponse) {
rabbitTemplate.convertAndSend(MIGRATION_RESPONSE_QUEUE, migrationResponse);
}
}


@ -0,0 +1,360 @@
package com.iqser.red.service.redaction.v1.server.migration;
import java.awt.geom.Rectangle2D;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import org.springframework.stereotype.Service;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntityLog;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntityLogLegalBasis;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.migration.MigratedIds;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.ManualRedactions;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.BaseAnnotation;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.IdRemoval;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualForceRedaction;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRedactionEntry;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualResizeRedaction;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.Rectangle;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.RedactionLog;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.RedactionLogEntry;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.RedactionLogLegalBasis;
import com.iqser.red.service.redaction.v1.server.model.MigratedEntityLog;
import com.iqser.red.service.redaction.v1.server.model.MigrationEntity;
import com.iqser.red.service.redaction.v1.server.model.PrecursorEntity;
import com.iqser.red.service.redaction.v1.server.model.RectangleWithPage;
import com.iqser.red.service.redaction.v1.server.model.document.entity.TextEntity;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Document;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Image;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.ImageType;
import com.iqser.red.service.redaction.v1.server.service.DictionaryService;
import com.iqser.red.service.redaction.v1.server.service.ManualChangesApplicationService;
import com.iqser.red.service.redaction.v1.server.service.document.EntityFindingUtility;
import com.iqser.red.service.redaction.v1.server.service.document.EntityFromPrecursorCreationService;
import com.iqser.red.service.redaction.v1.server.utils.IdBuilder;
import com.iqser.red.service.redaction.v1.server.utils.MigratedIdsCollector;
import lombok.AccessLevel;
import lombok.RequiredArgsConstructor;
import lombok.experimental.FieldDefaults;
import lombok.extern.slf4j.Slf4j;
@Slf4j
@Service
@RequiredArgsConstructor
@FieldDefaults(makeFinal = true, level = AccessLevel.PRIVATE)
//TODO: remove this, once the migration is done
public class RedactionLogToEntityLogMigrationService {
private static final double MATCH_THRESHOLD = 10;
EntityFindingUtility entityFindingUtility;
DictionaryService dictionaryService;
ManualChangesApplicationService manualChangesApplicationService;
public MigratedEntityLog migrate(RedactionLog redactionLog,
Document document,
String dossierTemplateId,
ManualRedactions manualRedactions,
String fileId,
Set<String> entitiesWithComments,
boolean fileIsApproved) {
log.info("Migrating entities for file {}", fileId);
List<MigrationEntity> entitiesToMigrate = calculateMigrationEntitiesFromRedactionLog(redactionLog, document, dossierTemplateId, fileId);
MigratedIds migratedIds = entitiesToMigrate.stream()
.collect(new MigratedIdsCollector());
log.info("Applying manual changes to migrated entities for file {}", fileId);
applyLocalProcessedManualChanges(entitiesToMigrate, manualRedactions, fileIsApproved);
EntityLog entityLog = new EntityLog();
entityLog.setAnalysisNumber(redactionLog.getAnalysisNumber());
entityLog.setRulesVersion(redactionLog.getRulesVersion());
entityLog.setDictionaryVersion(redactionLog.getDictionaryVersion());
entityLog.setDossierDictionaryVersion(redactionLog.getDossierDictionaryVersion());
entityLog.setLegalBasisVersion(redactionLog.getLegalBasisVersion());
entityLog.setAnalysisVersion(redactionLog.getAnalysisVersion());
entityLog.setLegalBasis(redactionLog.getLegalBasis()
.stream()
.map(RedactionLogToEntityLogMigrationService::toEntityLogLegalBasis)
.toList());
Map<String, String> oldToNewIDMapping = migratedIds.buildOldToNewMapping();
log.info("Writing migrated entities to entityLog for file {}", fileId);
entityLog.setEntityLogEntry(entitiesToMigrate.stream()
.map(migrationEntity -> migrationEntity.toEntityLogEntry(oldToNewIDMapping))
.toList());
long approvedEntries = getNumberOfApprovedEntries(redactionLog, document.getNumberOfPages());
if (approvedEntries != entityLog.getEntityLogEntry().size()) {
String message = String.format("Not all entities have been found during the migration: redactionLog has %d entries on existing pages and the new entityLog has %d",
approvedEntries,
entityLog.getEntityLogEntry().size());
log.error(message);
throw new AssertionError(message);
}
Set<String> entitiesWithUnprocessedChanges = manualRedactions.buildAll()
.stream()
.filter(manualRedaction -> manualRedaction.getProcessedDate() == null)
.map(BaseAnnotation::getAnnotationId)
.collect(Collectors.toSet());
MigratedIds idsToMigrateInDb = entitiesToMigrate.stream()
.filter(migrationEntity -> migrationEntity.hasManualChangesOrComments(entitiesWithComments, entitiesWithUnprocessedChanges))
.filter(m -> !m.getOldId().equals(m.getNewId()))
.collect(new MigratedIdsCollector());
List<ManualRedactionEntry> manualRedactionEntriesToAdd = entitiesToMigrate.stream()
.filter(MigrationEntity::needsManualEntry)
.map(MigrationEntity::buildManualRedactionEntry)
.toList();
idsToMigrateInDb.setManualRedactionEntriesToAdd(manualRedactionEntriesToAdd);
List<String> manualForceRedactionIdsToDelete = entitiesToMigrate.stream()
.filter(MigrationEntity::needsForceDeletion)
.map(MigrationEntity::getNewId)
.toList();
idsToMigrateInDb.setForceRedactionIdsToDelete(manualForceRedactionIdsToDelete);
return new MigratedEntityLog(idsToMigrateInDb, entityLog);
}
private void applyLocalProcessedManualChanges(List<MigrationEntity> entitiesToMigrate, ManualRedactions manualRedactions, boolean fileIsApproved) {
if (manualRedactions == null) {
return;
}
Map<String, List<BaseAnnotation>> manualChangesPerAnnotationId;
if (fileIsApproved) {
manualChangesPerAnnotationId = manualRedactions.buildAll()
.stream()
.filter(manualChange -> (manualChange.getProcessedDate() != null && manualChange.isLocal()) //
// unprocessed dict change of type IdRemoval or ManualResize must be applied for approved documents
|| (manualChange.getProcessedDate() == null && !manualChange.isLocal() //
&& (manualChange instanceof IdRemoval || manualChange instanceof ManualResizeRedaction)))
.map(this::convertPendingDictChangesToLocal)
.collect(Collectors.groupingBy(BaseAnnotation::getAnnotationId));
} else {
manualChangesPerAnnotationId = manualRedactions.buildAll()
.stream()
.filter(manualChange -> manualChange.getProcessedDate() != null)
.filter(BaseAnnotation::isLocal)
.collect(Collectors.groupingBy(BaseAnnotation::getAnnotationId));
}
entitiesToMigrate.forEach(migrationEntity -> migrationEntity.applyManualChanges(manualChangesPerAnnotationId.getOrDefault(migrationEntity.getOldId(),
Collections.emptyList()),
manualChangesApplicationService));
}
private BaseAnnotation convertPendingDictChangesToLocal(BaseAnnotation baseAnnotation) {
if (baseAnnotation.getProcessedDate() != null) {
return baseAnnotation;
}
if (baseAnnotation.isLocal()) {
return baseAnnotation;
}
if (baseAnnotation instanceof ManualResizeRedaction manualResizeRedaction) {
manualResizeRedaction.setAddToAllDossiers(false);
manualResizeRedaction.setUpdateDictionary(false);
} else if (baseAnnotation instanceof IdRemoval idRemoval) {
idRemoval.setRemoveFromAllDossiers(false);
idRemoval.setRemoveFromDictionary(false);
}
return baseAnnotation;
}
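`convertPendingDictChangesToLocal` dispatches on the annotation subtype with pattern-matching `instanceof`, which binds the casted variable directly in the condition. A minimal sketch of that idiom with hypothetical types (not the project's annotation classes), assuming Java 17+:

```java
public class InstanceofPatternSketch {

    sealed interface Annotation permits Resize, Removal {}

    record Resize(boolean updateDictionary) implements Annotation {}

    record Removal(boolean removeFromDictionary) implements Annotation {}

    // The pattern variable (resize / removal) is bound and in scope inside
    // each branch, so no explicit cast is needed after the instanceof check.
    static String describe(Annotation a) {
        if (a instanceof Resize resize) {
            return "resize, updateDictionary=" + resize.updateDictionary();
        } else if (a instanceof Removal removal) {
            return "removal, removeFromDictionary=" + removal.removeFromDictionary();
        }
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Resize(false))); // resize, updateDictionary=false
    }
}
```

The service method mutates the matched subtype in place the same way, flipping the dictionary-related flags so a pending dictionary change is treated as a local one.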
private long getNumberOfApprovedEntries(RedactionLog redactionLog, int numberOfPages) {
return redactionLog.getRedactionLogEntry()
.stream()
.filter(redactionLogEntry -> isOnExistingPage(redactionLogEntry, numberOfPages))
.count();
}
private List<MigrationEntity> calculateMigrationEntitiesFromRedactionLog(RedactionLog redactionLog, Document document, String dossierTemplateId, String fileId) {
List<MigrationEntity> images = getImageBasedMigrationEntities(redactionLog, document, fileId, dossierTemplateId);
List<MigrationEntity> textMigrationEntities = getTextBasedMigrationEntities(redactionLog, document, dossierTemplateId, fileId);
return Stream.concat(textMigrationEntities.stream(), images.stream())
.toList();
}
private static EntityLogLegalBasis toEntityLogLegalBasis(RedactionLogLegalBasis redactionLogLegalBasis) {
return new EntityLogLegalBasis(redactionLogLegalBasis.getName(), redactionLogLegalBasis.getDescription(), redactionLogLegalBasis.getReason());
}
private List<MigrationEntity> getImageBasedMigrationEntities(RedactionLog redactionLog, Document document, String fileId, String dossierTemplateId) {
List<Image> images = document.streamAllImages()
.collect(Collectors.toList());
List<RedactionLogEntry> redactionLogImages = redactionLog.getRedactionLogEntry()
.stream()
.filter(RedactionLogEntry::isImage)
.toList();
List<MigrationEntity> migrationEntities = new LinkedList<>();
for (RedactionLogEntry redactionLogImage : redactionLogImages) {
List<RectangleWithPage> imagePositions = redactionLogImage.getPositions()
.stream()
.map(RectangleWithPage::fromRedactionLogRectangle)
.toList();
assert imagePositions.size() == 1;
Optional<Image> optionalClosestImage = images.stream()
.filter(image -> image.onPage(redactionLogImage.getPositions()
.get(0).getPage()))
.min(Comparator.comparingDouble(image -> EntityFindingUtility.calculateDistance(image.getPosition(), imagePositions.get(0).rectangle2D())))
.filter(image -> EntityFindingUtility.calculateDistance(image.getPosition(), imagePositions.get(0).rectangle2D()) <= MATCH_THRESHOLD);
Image closestImage;
if (optionalClosestImage.isEmpty()) { // no matching image within the threshold: rebuild one from the logged values
closestImage = buildImageDirectly(document, redactionLogImage);
} else {
closestImage = optionalClosestImage.get();
images.remove(closestImage);
}
String ruleIdentifier;
String reason = Optional.ofNullable(redactionLogImage.getReason())
.orElse("");
// null check must run before isBlank() to avoid a NullPointerException
if (redactionLogImage.getMatchedRule() == null || redactionLogImage.getMatchedRule().isBlank()) {
ruleIdentifier = "OLDIMG.0.0";
} else {
ruleIdentifier = "OLDIMG." + redactionLogImage.getMatchedRule() + ".0";
}
if (redactionLogImage.lastChangeIsRemoved()) {
closestImage.remove(ruleIdentifier, reason);
} else if (redactionLogImage.isRedacted()) {
closestImage.apply(ruleIdentifier, reason, redactionLogImage.getLegalBasis());
} else {
closestImage.skip(ruleIdentifier, reason);
}
migrationEntities.add(MigrationEntity.fromRedactionLogImage(redactionLogImage, closestImage, fileId, dictionaryService, dossierTemplateId));
}
return migrationEntities;
}
private static Image buildImageDirectly(Document document, RedactionLogEntry redactionLogImage) {
Image image = Image.builder()
.documentTree(document.getDocumentTree())
.imageType(ImageType.fromString(redactionLogImage.getType()))
.transparent(redactionLogImage.isImageHasTransparency())
.page(document.getPages()
.stream()
.filter(p -> p.getNumber() == redactionLogImage.getPositions()
.get(0).getPage())
.findFirst()
.orElseThrow())
.position(toRectangle2D(redactionLogImage.getPositions()
.get(0)))
.build();
List<Integer> treeId = document.getDocumentTree().createNewMainEntryAndReturnId(image);
image.setTreeId(treeId);
image.setId(IdBuilder.buildId(image.getPages(),
image.getBBox().values()
.stream()
.toList(),
"",
""));
return image;
}
private static Rectangle2D toRectangle2D(Rectangle rect) {
return new Rectangle2D.Double(rect.getTopLeft().getX(), rect.getTopLeft().getY(), rect.getWidth(), rect.getHeight());
}
private List<MigrationEntity> getTextBasedMigrationEntities(RedactionLog redactionLog, Document document, String dossierTemplateId, String fileId) {
List<MigrationEntity> entitiesToMigrate = redactionLog.getRedactionLogEntry()
.stream()
.filter(redactionLogEntry -> !redactionLogEntry.isImage())
.filter(redactionLogEntry -> isOnExistingPage(redactionLogEntry, document.getNumberOfPages()))
.map(entry -> MigrationEntity.fromRedactionLogEntry(entry, fileId, dictionaryService, dossierTemplateId))
.toList();
List<PrecursorEntity> precursorEntities = entitiesToMigrate.stream()
.map(MigrationEntity::getPrecursorEntity)
.toList();
log.info("Finding all possible entities");
Map<String, List<TextEntity>> tempEntitiesByValue = entityFindingUtility.findAllPossibleEntitiesAndGroupByValue(document, precursorEntities);
for (MigrationEntity migrationEntity : entitiesToMigrate) {
Optional<TextEntity> optionalTextEntity = entityFindingUtility.findClosestEntityAndReturnEmptyIfNotFound(migrationEntity.getPrecursorEntity(),
tempEntitiesByValue,
MATCH_THRESHOLD);
if (optionalTextEntity.isEmpty()) {
migrationEntity.setMigratedEntity(migrationEntity.getPrecursorEntity());
migrationEntity.setOldId(migrationEntity.getPrecursorEntity().getId());
migrationEntity.setNewId(migrationEntity.getPrecursorEntity().getId());
continue;
}
TextEntity migratedEntity = EntityFromPrecursorCreationService.createCorrectEntity(migrationEntity.getPrecursorEntity(), optionalTextEntity.get(), true);
migrationEntity.setMigratedEntity(migratedEntity);
migrationEntity.setOldId(migrationEntity.getPrecursorEntity().getId());
migrationEntity.setNewId(migratedEntity.getId());
}
tempEntitiesByValue.values()
.stream()
.flatMap(Collection::stream)
.forEach(TextEntity::removeFromGraph);
return entitiesToMigrate;
}
private boolean isOnExistingPage(RedactionLogEntry redactionLogEntry, int numberOfPages) {
return redactionLogEntry.getPositions()
.stream()
.map(Rectangle::getPage)
.allMatch(page -> page <= numberOfPages);
}
}
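`getImageBasedMigrationEntities` matches each logged image against the geometrically closest candidate on the same page, then rejects the match when the distance exceeds `MATCH_THRESHOLD` by filtering the `Optional` produced by `min()`. A minimal sketch of that select-then-validate idiom, using a plain 1-D distance instead of `EntityFindingUtility.calculateDistance`:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class ClosestMatchSketch {

    static final double MATCH_THRESHOLD = 10;

    // Pick the candidate closest to the target, but only accept it when the
    // distance stays within the threshold; otherwise the Optional is empty.
    static Optional<Double> findClosest(List<Double> candidates, double target) {
        return candidates.stream()
                .min(Comparator.comparingDouble(c -> Math.abs(c - target)))
                .filter(c -> Math.abs(c - target) <= MATCH_THRESHOLD);
    }

    public static void main(String[] args) {
        System.out.println(findClosest(List.of(3.0, 12.0, 40.0), 10.0)); // Optional[12.0]
        System.out.println(findClosest(List.of(40.0), 10.0));            // Optional.empty
    }
}
```

An empty result is the signal to fall back, which is exactly what the service does by building a fresh `Image` from the redaction log entry.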


@ -0,0 +1,481 @@
package com.iqser.red.service.redaction.v1.server.model;
import static com.iqser.red.service.redaction.v1.server.service.EntityLogCreatorService.buildEntryState;
import static com.iqser.red.service.redaction.v1.server.service.EntityLogCreatorService.buildEntryType;
import java.awt.geom.Rectangle2D;
import java.time.OffsetDateTime;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Change;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ChangeType;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntityLogEntry;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntryState;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.EntryType;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Position;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.ManualChangeFactory;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.Rectangle;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.BaseAnnotation;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualForceRedaction;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRecategorization;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRedactionEntry;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualResizeRedaction;
import com.iqser.red.service.persistence.service.v1.api.shared.model.dossiertemplate.type.DictionaryEntryType;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.ManualRedactionType;
import com.iqser.red.service.persistence.service.v1.api.shared.model.redactionlog.RedactionLogEntry;
import com.iqser.red.service.redaction.v1.server.migration.MigrationMapper;
import com.iqser.red.service.redaction.v1.server.model.document.entity.EntityType;
import com.iqser.red.service.redaction.v1.server.model.document.entity.IEntity;
import com.iqser.red.service.redaction.v1.server.model.document.entity.ManualChangeOverwrite;
import com.iqser.red.service.redaction.v1.server.model.document.entity.TextEntity;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Image;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.ImageType;
import com.iqser.red.service.redaction.v1.server.service.DictionaryService;
import com.iqser.red.service.redaction.v1.server.service.ManualChangesApplicationService;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
@Slf4j
@Data
@Builder
@AllArgsConstructor
@RequiredArgsConstructor
public final class MigrationEntity {
private final PrecursorEntity precursorEntity;
private final RedactionLogEntry redactionLogEntry;
private final DictionaryService dictionaryService;
private final String dossierTemplateId;
private IEntity migratedEntity;
private String oldId;
private String newId;
private String fileId;
@Builder.Default
List<BaseAnnotation> manualChanges = new LinkedList<>();
public static MigrationEntity fromRedactionLogEntry(RedactionLogEntry redactionLogEntry, String fileId, DictionaryService dictionaryService, String dossierTemplateId) {
boolean hint = dictionaryService.isHint(redactionLogEntry.getType(), dossierTemplateId);
PrecursorEntity precursorEntity = createPrecursorEntity(redactionLogEntry, hint);
if (precursorEntity.getEntityType().equals(EntityType.HINT) && !redactionLogEntry.isHint() && !redactionLogEntry.isRedacted()) {
precursorEntity.ignore(precursorEntity.getRuleIdentifier(), precursorEntity.getReason());
} else if (redactionLogEntry.lastChangeIsRemoved()) {
precursorEntity.remove(precursorEntity.getRuleIdentifier(), precursorEntity.getReason());
} else if (lastManualChangeIsRemove(redactionLogEntry)) {
precursorEntity.ignore(precursorEntity.getRuleIdentifier(), precursorEntity.getReason());
} else if (precursorEntity.isApplied() && redactionLogEntry.isRecommendation()) {
precursorEntity.skip(precursorEntity.getRuleIdentifier(), precursorEntity.getReason());
} else if (precursorEntity.isApplied()) {
precursorEntity.apply(precursorEntity.getRuleIdentifier(), precursorEntity.getReason(), precursorEntity.getLegalBasis());
} else {
precursorEntity.skip(precursorEntity.getRuleIdentifier(), precursorEntity.getReason());
}
return MigrationEntity.builder()
.precursorEntity(precursorEntity)
.redactionLogEntry(redactionLogEntry)
.oldId(redactionLogEntry.getId())
.fileId(fileId)
.dictionaryService(dictionaryService)
.dossierTemplateId(dossierTemplateId)
.build();
}
public static MigrationEntity fromRedactionLogImage(RedactionLogEntry redactionLogImage,
Image image,
String fileId,
DictionaryService dictionaryService,
String dossierTemplateId) {
return MigrationEntity.builder()
.redactionLogEntry(redactionLogImage)
.migratedEntity(image)
.oldId(redactionLogImage.getId())
.newId(image.getId())
.fileId(fileId)
.dictionaryService(dictionaryService)
.dossierTemplateId(dossierTemplateId)
.build();
}
private static boolean lastManualChangeIsRemove(RedactionLogEntry redactionLogEntry) {
if (redactionLogEntry.getManualChanges() == null) {
return false;
}
return redactionLogEntry.getManualChanges()
.stream()
.reduce((a, b) -> b)
.map(m -> m.getManualRedactionType().equals(ManualRedactionType.REMOVE_LOCALLY))
.orElse(false);
}
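`lastManualChangeIsRemove` uses `reduce((a, b) -> b)` to take the last element of a stream without materializing a list first. A standalone sketch of the idiom:

```java
import java.util.List;
import java.util.Optional;

public class LastElementSketch {

    // reduce((a, b) -> b) keeps discarding the accumulator in favour of the
    // next element, so it yields the stream's last element (or empty for an
    // empty stream) -- no index arithmetic or intermediate list required.
    static Optional<String> last(List<String> values) {
        return values.stream().reduce((a, b) -> b);
    }

    public static void main(String[] args) {
        System.out.println(last(List.of("ADD", "REMOVE"))); // Optional[REMOVE]
        System.out.println(last(List.of()));                // Optional.empty
    }
}
```

The `Optional` result composes cleanly with `map(...).orElse(false)`, which is how the method above answers "is the last manual change a local remove?" without a null or emptiness check.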
public static PrecursorEntity createPrecursorEntity(RedactionLogEntry redactionLogEntry, boolean hint) {
String ruleIdentifier = buildRuleIdentifier(redactionLogEntry);
List<RectangleWithPage> rectangleWithPages = redactionLogEntry.getPositions()
.stream()
.map(RectangleWithPage::fromRedactionLogRectangle)
.toList();
EntityType entityType = getEntityType(redactionLogEntry, hint);
return PrecursorEntity.builder()
.id(redactionLogEntry.getId())
.value(redactionLogEntry.getValue())
.entityPosition(rectangleWithPages)
.ruleIdentifier(ruleIdentifier)
.reason(Optional.ofNullable(redactionLogEntry.getReason())
.orElse(""))
.legalBasis(redactionLogEntry.getLegalBasis())
.type(redactionLogEntry.getType())
.section(redactionLogEntry.getSection())
.engines(MigrationMapper.getMigratedEngines(redactionLogEntry))
.entityType(entityType)
.applied(redactionLogEntry.isRedacted())
.isDictionaryEntry(redactionLogEntry.isDictionaryEntry())
.isDossierDictionaryEntry(redactionLogEntry.isDossierDictionaryEntry())
.rectangle(redactionLogEntry.isRectangle())
.manualOverwrite(new ManualChangeOverwrite(entityType))
.build();
}
private static String buildRuleIdentifier(RedactionLogEntry redactionLogEntry) {
String ruleIdentifier;
if (redactionLogEntry.getMatchedRule() != null) {
ruleIdentifier = "OLD." + redactionLogEntry.getMatchedRule() + ".0";
} else {
ruleIdentifier = "MAN.5.0"; // pure ManualRedactions used to have no matched rule
}
return ruleIdentifier;
}
private static EntityType getEntityType(RedactionLogEntry redactionLogEntry, boolean hint) {
if (hint) {
return EntityType.HINT;
}
if (redactionLogEntry.isFalsePositive()) {
return EntityType.FALSE_POSITIVE;
}
if (redactionLogEntry.isHint()) {
return EntityType.HINT;
}
if (redactionLogEntry.isRecommendation()) {
return EntityType.RECOMMENDATION;
}
return EntityType.ENTITY;
}
public EntityLogEntry toEntityLogEntry(Map<String, String> oldToNewIdMapping) {
EntityLogEntry entityLogEntry;
if (migratedEntity instanceof Image image) {
entityLogEntry = createEntityLogEntry(image);
} else if (migratedEntity instanceof TextEntity textEntity) {
entityLogEntry = createEntityLogEntry(textEntity);
} else if (migratedEntity instanceof PrecursorEntity entity) {
entityLogEntry = createEntityLogEntry(entity);
} else {
throw new UnsupportedOperationException("Unknown subclass " + migratedEntity.getClass());
}
entityLogEntry.setManualChanges(ManualChangeFactory.toLocalManualChangeList(migratedEntity.getManualOverwrite().getManualChangeLog(), true));
entityLogEntry.setColor(redactionLogEntry.getColor());
entityLogEntry.setChanges(redactionLogEntry.getChanges()
.stream()
.map(MigrationMapper::toEntityLogChanges)
.toList());
entityLogEntry.setReference(migrateSetOfIds(redactionLogEntry.getReference(), oldToNewIdMapping));
entityLogEntry.setImportedRedactionIntersections(migrateSetOfIds(redactionLogEntry.getImportedRedactionIntersections(), oldToNewIdMapping));
entityLogEntry.setEngines(MigrationMapper.getMigratedEngines(redactionLogEntry));
if (entityLogEntry.getEntryType().equals(EntryType.HINT) && lastManualChangeIsRemoveLocally(entityLogEntry)) {
entityLogEntry.setState(EntryState.IGNORED);
}
if (redactionLogEntry.isImported() && redactionLogEntry.getValue() == null) {
entityLogEntry.setValue("Imported Redaction");
}
if (entityLogEntry.getChanges() != null
&& !entityLogEntry.getChanges().isEmpty()
&& entityLogEntry.getChanges().get(entityLogEntry.getChanges().size() - 1).getType().equals(ChangeType.REMOVED)) {
entityLogEntry.setState(EntryState.REMOVED);
if (!entityLogEntry.getManualChanges().isEmpty()) {
entityLogEntry.getManualChanges()
.removeIf(manualChange -> manualChange.getManualRedactionType()
.equals(com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.FORCE));
}
}
return entityLogEntry;
}
private static boolean lastManualChangeIsRemoveLocally(EntityLogEntry entityLogEntry) {
return entityLogEntry.getManualChanges()
.stream()
.reduce((a, b) -> b)
.filter(mc -> mc.getManualRedactionType().equals(com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.ManualRedactionType.REMOVE))
.isPresent();
}
private Set<String> migrateSetOfIds(Set<String> ids, Map<String, String> oldToNewIdMapping) {
if (ids == null) {
return Collections.emptySet();
}
return ids.stream()
.map(oldToNewIdMapping::get)
.collect(Collectors.toSet());
}
public EntityLogEntry createEntityLogEntry(Image image) {
String imageType = image.getImageType().equals(ImageType.OTHER) ? "image" : image.getImageType().toString().toLowerCase(Locale.ENGLISH);
List<Position> positions = getPositionsFromOverride(image).orElse(List.of(new Position(image.getPosition(), image.getPage().getNumber())));
return EntityLogEntry.builder()
.id(image.getId())
.value(image.getValue())
.type(imageType)
.reason(image.buildReasonWithManualChangeDescriptions())
.legalBasis(image.getManualOverwrite().getLegalBasis()
.orElse(redactionLogEntry.getLegalBasis()))
.matchedRule(image.getMatchedRule().getRuleIdentifier().toString())
.dictionaryEntry(false)
.positions(positions)
.containingNodeId(image.getTreeId())
.closestHeadline(image.getHeadline().getTextBlock().getSearchText())
.section(image.getManualOverwrite().getSection()
.orElse(redactionLogEntry.getSection()))
.textAfter(redactionLogEntry.getTextAfter())
.textBefore(redactionLogEntry.getTextBefore())
.imageHasTransparency(image.isTransparent())
.state(buildEntryState(image))
.entryType(dictionaryService.isHint(imageType, dossierTemplateId) ? EntryType.IMAGE_HINT : EntryType.IMAGE)
.build();
}
public EntityLogEntry createEntityLogEntry(PrecursorEntity precursorEntity) {
return EntityLogEntry.builder()
.id(precursorEntity.getId())
.reason(precursorEntity.buildReasonWithManualChangeDescriptions())
.legalBasis(precursorEntity.getManualOverwrite().getLegalBasis()
.orElse(redactionLogEntry.getLegalBasis()))
.value(precursorEntity.value())
.type(precursorEntity.type())
.state(buildEntryState(precursorEntity))
.entryType(buildEntryType(precursorEntity))
.section(precursorEntity.getManualOverwrite().getSection()
.orElse(redactionLogEntry.getSection()))
.textAfter(redactionLogEntry.getTextAfter())
.textBefore(redactionLogEntry.getTextBefore())
.containingNodeId(Collections.emptyList())
.closestHeadline("")
.matchedRule(precursorEntity.getMatchedRule().getRuleIdentifier().toString())
.dictionaryEntry(precursorEntity.isDictionaryEntry())
.dossierDictionaryEntry(precursorEntity.isDossierDictionaryEntry())
.startOffset(-1)
.endOffset(-1)
.positions(precursorEntity.getManualOverwrite().getPositions()
.orElse(precursorEntity.getEntityPosition())
.stream()
.map(entityPosition -> new Position(entityPosition.rectangle2D(), entityPosition.pageNumber()))
.toList())
.engines(Collections.emptySet())
.build();
}
public EntityLogEntry createEntityLogEntry(TextEntity entity) {
assert entity.getPositionsOnPagePerPage().size() == 1;
List<Position> rectanglesPerLine = getRectanglesPerLine(entity);
return EntityLogEntry.builder()
.id(entity.getId())
.positions(rectanglesPerLine)
.reason(entity.buildReasonWithManualChangeDescriptions())
.legalBasis(entity.getManualOverwrite().getLegalBasis()
.orElse(redactionLogEntry.getLegalBasis()))
.value(entity.getManualOverwrite().getValue()
.orElse(entity.getMatchedRule().isWriteValueWithLineBreaks() ? entity.getValueWithLineBreaks() : entity.getValue()))
.type(entity.type())
.section(entity.getManualOverwrite().getSection()
.orElse(redactionLogEntry.getSection()))
.textAfter(entity.getTextAfter())
.textBefore(entity.getTextBefore())
.containingNodeId(entity.getDeepestFullyContainingNode().getTreeId())
.closestHeadline(entity.getDeepestFullyContainingNode().getHeadline().getTextBlock().getSearchText())
.matchedRule(entity.getMatchedRule().getRuleIdentifier().toString())
.dictionaryEntry(entity.isDictionaryEntry())
.startOffset(entity.getTextRange().start())
.endOffset(entity.getTextRange().end())
.dossierDictionaryEntry(entity.isDossierDictionaryEntry())
.engines(entity.getEngines() != null ? entity.getEngines() : Collections.emptySet())
.state(buildEntryState(entity))
.entryType(buildEntryType(entity))
.build();
}
private static List<Position> getRectanglesPerLine(TextEntity entity) {
return getPositionsFromOverride(entity).orElse(entity.getPositionsOnPagePerPage()
.get(0).getRectanglePerLine()
.stream()
.map(rectangle2D -> new Position(rectangle2D,
entity.getPositionsOnPagePerPage()
.get(0).getPage().getNumber()))
.toList());
}
private static Optional<List<Position>> getPositionsFromOverride(IEntity entity) {
return entity.getManualOverwrite().getPositions()
.map(rects -> rects.stream()
.map(r -> new Position(r.rectangle2D(), r.pageNumber()))
.toList());
}
public boolean hasManualChangesOrComments(Set<String> entitiesWithComments, Set<String> entitiesWithUnprocessedChanges) {
return !(redactionLogEntry.getManualChanges() == null || redactionLogEntry.getManualChanges().isEmpty()) || //
!(redactionLogEntry.getComments() == null || redactionLogEntry.getComments().isEmpty()) //
|| hasManualChanges() || entitiesWithComments.contains(oldId) || entitiesWithUnprocessedChanges.contains(oldId);
}
public boolean hasManualChanges() {
return !manualChanges.isEmpty();
}
public void applyManualChanges(List<BaseAnnotation> manualChangesToApply, ManualChangesApplicationService manualChangesApplicationService) {
manualChanges.addAll(manualChangesToApply);
manualChangesToApply.forEach(manualChange -> {
if (manualChange instanceof ManualResizeRedaction manualResizeRedaction && migratedEntity instanceof TextEntity textEntity) {
manualResizeRedaction.setAnnotationId(newId);
manualChangesApplicationService.resize(textEntity, manualResizeRedaction);
} else if (manualChange instanceof ManualRecategorization manualRecategorization && migratedEntity instanceof Image image) {
image.setImageType(ImageType.fromString(manualRecategorization.getType()));
migratedEntity.getManualOverwrite().addChange(manualChange);
} else {
migratedEntity.getManualOverwrite().addChange(manualChange);
}
});
}
public ManualRedactionEntry buildManualRedactionEntry() {
assert hasManualChanges();
// Currently we need to insert a manual redaction entry whenever an entity has been resized.
String user = manualChanges.stream()
.filter(mc -> mc instanceof ManualResizeRedaction)
.findFirst()
.orElse(manualChanges.get(0)).getUser();
OffsetDateTime requestDate = manualChanges.get(0).getRequestDate();
return ManualRedactionEntry.builder()
.annotationId(newId)
.fileId(fileId)
.user(user)
.requestDate(requestDate)
.type(redactionLogEntry.getType())
.value(redactionLogEntry.getValue())
.reason(redactionLogEntry.getReason())
.legalBasis(redactionLogEntry.getLegalBasis())
.section(redactionLogEntry.getSection())
.rectangle(false)
.addToDictionary(false)
.addToDossierDictionary(false)
.positions(buildPositions(migratedEntity))
.textAfter(redactionLogEntry.getTextAfter())
.textBefore(redactionLogEntry.getTextBefore())
.dictionaryEntryType(DictionaryEntryType.ENTRY)
.build();
}
private List<Rectangle> buildPositions(IEntity entity) {
if (entity instanceof TextEntity textEntity) {
var positionsOnPage = textEntity.getPositionsOnPagePerPage()
.get(0);
return positionsOnPage.getRectanglePerLine()
.stream()
.map(p -> new Rectangle((float) p.getX(), (float) p.getY(), (float) p.getWidth(), (float) p.getHeight(), positionsOnPage.getPage().getNumber()))
.toList();
}
if (entity instanceof PrecursorEntity pEntity) {
return pEntity.getManualOverwrite().getPositions()
.orElse(pEntity.getEntityPosition())
.stream()
.map(p -> new Rectangle((float) p.rectangle2D().getX(),
(float) p.rectangle2D().getY(),
(float) p.rectangle2D().getWidth(),
(float) p.rectangle2D().getHeight(),
p.pageNumber()))
.toList();
}
if (entity instanceof Image image) {
Rectangle2D position = image.getManualOverwrite().getPositions()
.map(p -> p.get(0).rectangle2D())
.orElse(image.getPosition());
return List.of(new Rectangle((float) position.getX(), (float) position.getY(), (float) position.getWidth(), (float) position.getHeight(), image.getPage().getNumber()));
} else {
throw new UnsupportedOperationException();
}
}
public boolean needsManualEntry() {
return manualChanges.stream()
.anyMatch(mc -> mc instanceof ManualResizeRedaction && !((ManualResizeRedaction) mc).getUpdateDictionary()) && !(migratedEntity instanceof Image);
}
public boolean needsForceDeletion() {
return manualChanges.stream()
.anyMatch(mc -> mc instanceof ManualForceRedaction) && this.precursorEntity != null && this.precursorEntity.removed();
}
}
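The `lastManualChangeIsRemoveLocally` helper above relies on `reduce((a, b) -> b)` to obtain the last element of a stream as an `Optional`. A minimal, self-contained sketch of just that idiom (class name `LastElement` is hypothetical, for illustration only):

```java
import java.util.List;
import java.util.Optional;

public class LastElement {

    // reduce((a, b) -> b) discards the accumulator at every step,
    // so the final result is the last element of the stream (empty for an empty stream).
    static <T> Optional<T> last(List<T> items) {
        return items.stream().reduce((a, b) -> b);
    }
}
```

This is equivalent to indexing the last element of a materialized list, but works on any stream without collecting it first.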

View File

@@ -1,6 +1,5 @@
package com.iqser.red.service.redaction.v1.server.model;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.stream.Stream;
@@ -18,7 +17,7 @@ import lombok.experimental.FieldDefaults;
*/
@Getter
@AllArgsConstructor
@FieldDefaults(level = AccessLevel.PRIVATE)
@FieldDefaults(level = AccessLevel.PRIVATE, makeFinal = true)
public class NerEntities {
List<NerEntity> nerEntityList;
@@ -30,14 +29,6 @@ public class NerEntities {
}
public void merge(NerEntities other) {
List<NerEntity> mergedList = new ArrayList<>(nerEntityList);
mergedList.addAll(other.getNerEntityList());
nerEntityList = mergedList;
}
/**
* Checks if there are any entities of a specified type.
*
@@ -67,28 +58,7 @@
/**
* Represents a single NER entity with its value, text range, and type.
*/
public record NerEntity(String value, TextRange textRange, String type, Double confidence, Engine engine) {
public NerEntity(String value, TextRange textRange, String type) {
this(value, textRange, type, null, Engine.NER);
}
}
public enum Engine {
NER,
CLOUD_NER,
LLM_NER
}
public static com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine mapToPrimaryEngine(NerEntities.Engine nerEntityEngine) {
return switch (nerEntityEngine) {
case NER, CLOUD_NER -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine.NER;
case LLM_NER -> com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine.LLM_NER;
};
public record NerEntity(String value, TextRange textRange, String type) {
}

View File

@@ -2,7 +2,6 @@ package com.iqser.red.service.redaction.v1.server.model;
import static com.iqser.red.service.redaction.v1.server.service.NotFoundImportedEntitiesService.IMPORTED_REDACTION_TYPE;
import java.util.Collection;
import java.util.List;
import java.util.Optional;
import java.util.PriorityQueue;
@@ -16,12 +15,10 @@ import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRedactionEntry;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualResizeRedaction;
import com.iqser.red.service.redaction.v1.server.model.document.TextRange;
import com.iqser.red.service.redaction.v1.server.model.document.entity.EntityEventListener;
import com.iqser.red.service.redaction.v1.server.model.document.entity.EntityType;
import com.iqser.red.service.redaction.v1.server.model.document.entity.IEntity;
import com.iqser.red.service.redaction.v1.server.model.document.entity.ManualChangeOverwrite;
import com.iqser.red.service.redaction.v1.server.model.document.entity.MatchedRule;
import com.iqser.red.service.redaction.v1.server.model.document.entity.RectangleWithPage;
import lombok.AccessLevel;
import lombok.AllArgsConstructor;
@@ -54,8 +51,7 @@ public class PrecursorEntity implements IEntity {
@Builder.Default
PriorityQueue<MatchedRule> matchedRuleList = new PriorityQueue<>();
@Builder.Default
ManualChangeOverwrite manualOverwrite = new ManualChangeOverwrite();
ManualChangeOverwrite manualOverwrite;
public static PrecursorEntity fromManualRedactionEntry(ManualRedactionEntry manualRedactionEntry, boolean hint) {
@@ -129,7 +125,6 @@ public class PrecursorEntity implements IEntity {
.id(importedRedaction.getId())
.value(value)
.entityPosition(rectangleWithPages)
.ruleIdentifier("IMP.0.0")
.reason(Optional.ofNullable(importedRedaction.getReason())
.orElse(""))
.legalBasis(Optional.ofNullable(importedRedaction.getLegalBasis())
@@ -185,44 +180,6 @@ public class PrecursorEntity implements IEntity {
}
/**
* @return true when this entity is of EntityType ENTITY or HINT
*/
public boolean validEntityType() {
return entityType.equals(EntityType.ENTITY) || entityType.equals(EntityType.HINT);
}
@Override
public boolean valid() {
return active() && validEntityType();
}
@Override
public void addEntityEventListener(EntityEventListener listener) {
throw new UnsupportedOperationException("PrecursorEntity does not support entityEventListeners");
}
@Override
public void removeEntityEventListener(EntityEventListener listener) {
throw new UnsupportedOperationException("PrecursorEntity does not support entityEventListeners");
}
@Override
public Collection<EntityEventListener> getEntityEventListeners() {
throw new UnsupportedOperationException("PrecursorEntity does not support entityEventListeners");
}
private static EntityType getEntityType(EntryType entryType) {
switch (entryType) {

View File

@@ -1,4 +1,4 @@
package com.iqser.red.service.redaction.v1.server.model.document.entity;
package com.iqser.red.service.redaction.v1.server.model;
import java.awt.geom.Rectangle2D;

View File

@@ -3,7 +3,7 @@ package com.iqser.red.service.redaction.v1.server.model.component;
import java.util.Collection;
import java.util.List;
import com.iqser.red.service.redaction.v1.server.model.document.entity.RuleIdentifier;
import com.iqser.red.service.redaction.v1.server.model.drools.RuleIdentifier;
import lombok.AccessLevel;
import lombok.AllArgsConstructor;

View File

@@ -1,130 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.dictionary;
import java.util.*;
import java.util.stream.Stream;
import com.iqser.red.service.redaction.v1.server.model.document.TextRange;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlock;
public abstract class AbstractDictionarySearch implements DictionarySearch {
protected final Map<String, List<DictionaryIdentifierWithKeyword>> keyWordToIdentifiersMap;
public AbstractDictionarySearch(Map<String, List<DictionaryIdentifierWithKeyword>> keyWordToIdentifiersMap) {
this.keyWordToIdentifiersMap = keyWordToIdentifiersMap;
}
@Override
public Stream<MatchTextRange> getBoundaries(CharSequence text) {
TextContext textContext = new TextContext(text);
return getMatchTextRangeStream(textContext);
}
@Override
public Stream<MatchTextRange> getBoundaries(CharSequence text, TextRange region) {
CharSequence subText = text.subSequence(region.start(), region.end());
TextContext textContext = new TextContext(subText, region.start());
return getMatchTextRangeStream(textContext);
}
@Override
public Stream<MatchTextRange> getBoundaries(TextBlock textBlock) {
return getBoundaries(textBlock, textBlock.getTextRange());
}
@Override
public Stream<MatchPosition> getMatches(String text) {
TextContext textContext = new TextContext(text);
List<MatchPosition> matches = new ArrayList<>();
parseText(textContext.getLowerText(), (begin, end, value) -> addMatchPositionsForHit(textContext, matches, new Hit(begin, end, value)));
return matches.stream();
}
private Stream<MatchTextRange> getMatchTextRangeStream(TextContext textContext) {
List<MatchTextRange> matches = new ArrayList<>();
parseText(textContext.getLowerText(), (begin, end, value) -> addMatchesForHit(textContext, matches, new Hit(begin, end, value)));
return matches.stream();
}
protected abstract void parseText(CharSequence text, HitHandler handler);
protected void addMatchesForHit(TextContext textContext, List<MatchTextRange> matches, Hit hit) {
int start = textContext.getStart(hit.begin);
int end = textContext.getEnd(hit.end);
String matchedText = textContext.getMatchedText(hit.begin, hit.end);
List<DictionaryIdentifierWithKeyword> idWithKeywords = hit.value;
for (DictionaryIdentifierWithKeyword idkw : idWithKeywords) {
if (idkw.identifier().caseSensitive()) {
if (matchedText.equals(idkw.keyword())) {
matches.add(new MatchTextRange(idkw.identifier(), new TextRange(start, end)));
}
} else {
matches.add(new MatchTextRange(idkw.identifier(), new TextRange(start, end)));
}
}
}
protected void addMatchPositionsForHit(TextContext textContext, List<MatchPosition> matches, Hit hit) {
int start = textContext.getStart(hit.begin);
int end = textContext.getEnd(hit.end);
String matchedText = textContext.getMatchedText(hit.begin, hit.end);
List<DictionaryIdentifierWithKeyword> idWithKeywords = hit.value;
for (DictionaryIdentifierWithKeyword idkw : idWithKeywords) {
MatchPosition matchPosition = new MatchPosition(idkw.identifier(), start, end);
if (idkw.identifier().caseSensitive()) {
if (matchedText.equals(idkw.keyword())) {
matches.add(matchPosition);
}
} else {
matches.add(matchPosition);
}
}
}
protected interface HitHandler {
void handle(int begin, int end, List<DictionaryIdentifierWithKeyword> value);
}
protected static class Hit {
final int begin;
final int end;
final List<DictionaryIdentifierWithKeyword> value;
Hit(int begin, int end, List<DictionaryIdentifierWithKeyword> value) {
this.begin = begin;
this.end = end;
this.value = value;
}
}
}

View File

@@ -1,32 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.dictionary;
import java.util.List;
import java.util.Map;
import com.roklenarcic.util.strings.AhoCorasickMap;
import com.roklenarcic.util.strings.MapMatchListener;
import com.roklenarcic.util.strings.StringMap;
public class AhoCorasickMapDictionarySearch extends AbstractDictionarySearch {
private final StringMap<List<DictionaryIdentifierWithKeyword>> map;
public AhoCorasickMapDictionarySearch(Map<String, List<DictionaryIdentifierWithKeyword>> keyWordToIdentifiersMap) {
super(keyWordToIdentifiersMap);
map = new AhoCorasickMap<>(keyWordToIdentifiersMap.keySet(), keyWordToIdentifiersMap.values(), false);
}
@Override
protected void parseText(CharSequence text, HitHandler handler) {
MapMatchListener<List<DictionaryIdentifierWithKeyword>> listener = (haystack, startPosition, endPosition, value) -> {
handler.handle(startPosition, endPosition, value);
return true;
};
map.match(text.toString(), listener);
}
}
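The `AhoCorasickMapDictionarySearch` above delegates the actual scan to a trie and only adapts hits to the `HitHandler` contract (`handle(begin, end, value)`). That contract can be illustrated with a deliberately naive standalone sketch — a linear substring scan instead of the Aho-Corasick trie, with hypothetical simplified types (`NaiveDictionarySearch`, identifier lists as plain strings); this is for illustration only and is not the library's implementation:

```java
import java.util.*;

public class NaiveDictionarySearch {

    // Mirrors the HitHandler contract: report (begin, end, value) for every keyword occurrence.
    interface HitHandler {
        void handle(int begin, int end, List<String> value);
    }

    private final Map<String, List<String>> keywordToIdentifiers;

    NaiveDictionarySearch(Map<String, List<String>> keywordToIdentifiers) {
        this.keywordToIdentifiers = keywordToIdentifiers;
    }

    // Naive O(text length * keyword count) scan; an Aho-Corasick trie finds all
    // occurrences of all keywords in a single pass over the text instead.
    void parseText(CharSequence text, HitHandler handler) {
        String lower = text.toString().toLowerCase(Locale.ENGLISH);
        for (Map.Entry<String, List<String>> entry : keywordToIdentifiers.entrySet()) {
            String keyword = entry.getKey().toLowerCase(Locale.ENGLISH);
            int from = 0;
            int idx;
            while ((idx = lower.indexOf(keyword, from)) >= 0) {
                handler.handle(idx, idx + keyword.length(), entry.getValue());
                from = idx + 1; // allow overlapping matches
            }
        }
    }
}
```

Because the handler receives raw offsets plus the identifier list, post-processing such as the case-sensitivity filtering in `addMatchesForHit` stays independent of which search backend produced the hit.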

View File

@@ -2,16 +2,20 @@ package com.iqser.red.service.redaction.v1.server.model.dictionary;
import static java.lang.String.format;
import java.util.*;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import org.apache.commons.lang3.StringUtils;
import com.iqser.red.service.dictionarymerge.commons.DictionaryEntry;
import com.iqser.red.service.redaction.v1.server.model.document.entity.EntityType;
import com.iqser.red.service.redaction.v1.server.model.document.entity.MatchedRule;
import com.iqser.red.service.redaction.v1.server.model.document.entity.Relation;
import com.iqser.red.service.redaction.v1.server.model.document.entity.TextEntity;
import com.iqser.red.service.redaction.v1.server.utils.Patterns;
import com.iqser.red.service.redaction.v1.server.utils.exception.NotFoundException;
@@ -25,77 +29,28 @@ import lombok.Getter;
@Data
public class Dictionary {
private final Map<String, Map<Level, DictionaryModel>> localAccessMap = new HashMap<>();
@Getter
private List<DictionaryModel> dictionaryModels;
private Map<String, DictionaryModel> localAccessMap = new HashMap<>();
private final DictionaryVersion version;
private final DictionarySearch dictionarySearch;
public enum Level {
DOSSIER_TEMPLATE,
DOSSIER
}
@Getter
private DictionaryVersion version;
Dictionary(List<DictionaryModel> dictionaryModels, DictionaryVersion version, DictionarySearch dictionarySearch) {
public Dictionary(List<DictionaryModel> dictionaryModels, DictionaryVersion version) {
dictionaryModels.forEach(dm -> localAccessMap.put(dm.getType(), Map.of(getLevel(dm.isDossierDictionary()), dm)));
this.dictionaryModels = dictionaryModels;
this.dictionaryModels.forEach(dm -> localAccessMap.put(dm.getType(), dm));
this.version = version;
this.dictionarySearch = dictionarySearch;
}
public boolean containsType(String type) {
Map<Level, DictionaryModel> levelMap = localAccessMap.get(type);
return !(levelMap == null || levelMap.isEmpty());
}
private Level getLevel(boolean isDossierDictionary) {
return isDossierDictionary ? Level.DOSSIER : Level.DOSSIER_TEMPLATE;
}
/**
* Determines the default level for a given type based on the levels present.
* If both levels are present, it defaults to {@code Level.DOSSIER}.
*
* @param type The type to determine the default level for.
* @return The default {@link Level} for the specified type.
* @throws NotFoundException If the type is not found in the dictionary.
*/
private Level getDefaultLevel(String type) {
Map<Level, DictionaryModel> levelMap = localAccessMap.get(type);
if (levelMap == null || levelMap.isEmpty()) {
throw new NotFoundException("Type: " + type + " is not found");
}
if (levelMap.containsKey(Level.DOSSIER)) {
return Level.DOSSIER;
} else {
// Use whatever level is present
return levelMap.keySet()
.iterator().next();
}
}
public int getDictionaryRank(String type, Level level) {
if (!localAccessMap.containsKey(type)) {
return 0;
}
DictionaryModel model = localAccessMap.get(type)
.get(level);
return model != null ? model.getRank() : 0;
}
public int getDictionaryRank(String type) {
return getDictionaryRank(type, getDefaultLevel(type));
if (!localAccessMap.containsKey(type)) {
return 0;
}
return localAccessMap.get(type).getRank();
}
@@ -106,21 +61,11 @@
*/
public boolean hasLocalEntries() {
return getDictionaryModels().stream()
return dictionaryModels.stream()
.anyMatch(dm -> !dm.getLocalEntriesWithMatchedRules().isEmpty());
}
public List<DictionaryModel> getDictionaryModels() {
return localAccessMap.values()
.stream()
.flatMap(levelDictionaryModelMap -> levelDictionaryModelMap.values()
.stream())
.toList();
}
public Set<String> getTypes() {
return localAccessMap.keySet();
@@ -128,144 +73,56 @@
/**
* Retrieves the {@link DictionaryModel} of a specified type and level.
*
* @param type The type of dictionary model to retrieve.
* @param level The level of the dictionary model to retrieve.
* @return The {@link DictionaryModel} of the specified type and level.
* @throws NotFoundException If the specified type or level is not found in the dictionary.
*/
public DictionaryModel getType(String type, Level level) {
Map<Level, DictionaryModel> levelMap = localAccessMap.get(type);
if (levelMap == null || !levelMap.containsKey(level)) {
throw new NotFoundException("Type: " + type + " with level: " + level + " is not found");
}
return levelMap.get(level);
}
/**
* Retrieves the {@link DictionaryModel} of a specified type at the default level.
* Retrieves the {@link DictionaryModel} of a specified type.
*
* @param type The type of dictionary model to retrieve.
* @return The {@link DictionaryModel} of the specified type at the default level.
* @return The {@link DictionaryModel} of the specified type.
* @throws NotFoundException If the specified type is not found in the dictionary.
*/
public DictionaryModel getType(String type) {
return getType(type, getDefaultLevel(type));
DictionaryModel model = localAccessMap.get(type);
if (model == null) {
throw new NotFoundException("Type: " + type + " is not found");
}
return model;
}
/**
* Checks if the dictionary of a specific type and level is considered a hint.
*
* @param type The type of dictionary to check.
* @param level The level of the dictionary to check.
* @return true if the dictionary model is marked as a hint, false otherwise.
*/
public boolean isHint(String type, Level level) {
DictionaryModel model = localAccessMap.get(type)
.get(level);
return model != null && model.isHint();
}
/**
* Checks if the dictionary of a specific type is considered a hint at the default level.
* Checks if the dictionary of a specific type is considered a hint.
*
* @param type The type of dictionary to check.
* @return true if the dictionary model is marked as a hint, false otherwise.
*/
public boolean isHint(String type) {
return isHint(type, getDefaultLevel(type));
DictionaryModel model = localAccessMap.get(type);
if (model != null) {
return model.isHint();
}
return false;
}
/**
* Checks if the dictionary of a specific type and level is case-insensitive.
*
* @param type The type of dictionary to check.
* @param level The level of the dictionary to check.
* @return true if the dictionary is case-insensitive, false otherwise.
*/
public boolean isCaseInsensitiveDictionary(String type, Level level) {
DictionaryModel dictionaryModel = localAccessMap.get(type)
.get(level);
return dictionaryModel != null && dictionaryModel.isCaseInsensitive();
}
/**
* Checks if the dictionary of a specific type is case-insensitive at the default level.
* Checks if the dictionary of a specific type is case-insensitive.
*
* @param type The type of dictionary to check.
* @return true if the dictionary is case-insensitive, false otherwise.
*/
public boolean isCaseInsensitiveDictionary(String type) {
return isCaseInsensitiveDictionary(type, getDefaultLevel(type));
DictionaryModel dictionaryModel = localAccessMap.get(type);
if (dictionaryModel != null) {
return dictionaryModel.isCaseInsensitive();
}
return false;
}
/**
* Adds a local dictionary entry of a specific type and level.
*
* @param type The type of dictionary to add the entry to.
* @param value The value of the entry.
* @param matchedRules A collection of {@link MatchedRule} associated with the entry.
* @param alsoAddLastname Indicates whether to also add the lastname separately as an entry.
* @param level The level of the dictionary where the entry should be added.
* @throws IllegalArgumentException If the specified type does not exist within the dictionary, if the type
* does not have any local entries defined, or if the provided value is
* blank. This ensures that only valid, non-empty entries
* are added to the dictionary.
*/
private void addLocalDictionaryEntry(String type, String value, Collection<MatchedRule> matchedRules, boolean alsoAddLastname, Level level) {
if (value.isBlank()) {
return;
}
Map<Level, DictionaryModel> levelMap = localAccessMap.get(type);
if (levelMap == null || !levelMap.containsKey(level)) {
throw new IllegalArgumentException(format("DictionaryModel of type %s with level %s does not exist", type, level));
}
DictionaryModel dictionaryModel = levelMap.get(level);
if (dictionaryModel.getLocalEntriesWithMatchedRules() == null) {
throw new IllegalArgumentException(format("DictionaryModel of type %s has no local Entries", type));
}
if (StringUtils.isEmpty(value)) {
throw new IllegalArgumentException(format("%s is not a valid dictionary entry", value));
}
boolean isCaseInsensitive = dictionaryModel.isCaseInsensitive();
Set<MatchedRule> matchedRulesSet = new HashSet<>(matchedRules);
String cleanedValue = value;
if (isCaseInsensitive) {
cleanedValue = cleanedValue.toLowerCase(Locale.US);
}
dictionaryModel.getLocalEntriesWithMatchedRules()
.merge(cleanedValue.trim(),
matchedRulesSet,
(set1, set2) -> Stream.concat(set1.stream(), set2.stream())
.collect(Collectors.toSet()));
if (alsoAddLastname) {
String lastname = cleanedValue.split(" ")[0];
dictionaryModel.getLocalEntriesWithMatchedRules()
.merge(lastname,
matchedRulesSet,
(set1, set2) -> Stream.concat(set1.stream(), set2.stream())
.collect(Collectors.toSet()));
}
}
/**
* Adds a local dictionary entry of a specific type at the default level.
* Adds a local dictionary entry of a specific type.
*
* @param type The type of dictionary to add the entry to.
* @param value The value of the entry.
@@ -278,7 +135,40 @@
*/
private void addLocalDictionaryEntry(String type, String value, Collection<MatchedRule> matchedRules, boolean alsoAddLastname) {
addLocalDictionaryEntry(type, value, matchedRules, alsoAddLastname, getDefaultLevel(type));
if (value.isBlank()) {
return;
}
if (localAccessMap.get(type) == null) {
throw new IllegalArgumentException(format("DictionaryModel of type %s does not exist", type));
}
if (localAccessMap.get(type).getLocalEntriesWithMatchedRules() == null) {
throw new IllegalArgumentException(format("DictionaryModel of type %s has no local Entries", type));
}
if (StringUtils.isEmpty(value)) {
throw new IllegalArgumentException(format("%s is not a valid dictionary entry", value));
}
boolean isCaseInsensitive = localAccessMap.get(type).isCaseInsensitive();
Set<MatchedRule> matchedRulesSet = new HashSet<>(matchedRules);
String cleanedValue = value;
if (isCaseInsensitive) {
cleanedValue = cleanedValue.toLowerCase(Locale.US);
}
localAccessMap.get(type)
.getLocalEntriesWithMatchedRules()
.merge(cleanedValue.trim(),
matchedRulesSet,
(set1, set2) -> Stream.concat(set1.stream(), set2.stream())
.collect(Collectors.toSet()));
if (alsoAddLastname) {
String lastname = cleanedValue.split(" ")[0];
localAccessMap.get(type)
.getLocalEntriesWithMatchedRules()
.merge(lastname,
matchedRulesSet,
(set1, set2) -> Stream.concat(set1.stream(), set2.stream())
.collect(Collectors.toSet()));
}
}
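The entry-accumulation step in `addLocalDictionaryEntry` above relies on `Map.merge` with a set-union remapping function, so repeated values union their matched rules instead of overwriting them. A minimal standalone sketch of that pattern (class and parameter names hypothetical, rules simplified to plain strings):

```java
import java.util.*;
import java.util.stream.*;

public class LocalEntryMerge {

    // Merge a new set of rule names into the entry map, unioning with any existing set
    // for the same (trimmed, lower-cased) value.
    static void addEntry(Map<String, Set<String>> localEntries, String value, Set<String> rules) {
        if (value.isBlank()) {
            return; // mirrors the blank-value guard in addLocalDictionaryEntry
        }
        localEntries.merge(value.trim().toLowerCase(Locale.US),
                new HashSet<>(rules),
                (existing, incoming) -> Stream.concat(existing.stream(), incoming.stream())
                        .collect(Collectors.toSet()));
    }
}
```

Passing a fresh `HashSet` as the merge value keeps the caller's rule collection unmodified even when it becomes the map's stored set.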
@@ -286,22 +176,10 @@
* Recommends a text entity for inclusion in every dictionary model without separating the last name.
*
* @param textEntity The {@link TextEntity} to be recommended.
* @param level The level of the dictionary where the recommendation should be added.
*/
public void recommendEverywhere(TextEntity textEntity, Level level) {
addLocalDictionaryEntry(textEntity.type(), textEntity.getValue(), textEntity.getMatchedRuleList(), false, level);
}
/**
* Recommends a text entity for inclusion in every dictionary model without separating the last name at the default level.
*
* @param textEntity The {@link TextEntity} to be recommended.
*/
public void recommendEverywhere(TextEntity textEntity) {
recommendEverywhere(textEntity, getDefaultLevel(textEntity.type()));
addLocalDictionaryEntry(textEntity.type(), textEntity.getValue(), textEntity.getMatchedRuleList(), false);
}
@@ -309,22 +187,10 @@
* Recommends a text entity for inclusion in every dictionary model with the last name added separately.
*
* @param textEntity The {@link TextEntity} to be recommended.
* @param level The level of the dictionary where the recommendation should be added.
*/
public void recommendEverywhereWithLastNameSeparately(TextEntity textEntity, Level level) {
addLocalDictionaryEntry(textEntity.type(), textEntity.getValue(), textEntity.getMatchedRuleList(), true, level);
}
/**
* Recommends a text entity for inclusion in every dictionary model with the last name added separately at the default level.
*
* @param textEntity The {@link TextEntity} to be recommended.
*/
public void recommendEverywhereWithLastNameSeparately(TextEntity textEntity) {
recommendEverywhereWithLastNameSeparately(textEntity, getDefaultLevel(textEntity.type()));
addLocalDictionaryEntry(textEntity.type(), textEntity.getValue(), textEntity.getMatchedRuleList(), true);
}
@@ -332,22 +198,11 @@
* Adds multiple author names contained within a text entity as recommendations in the dictionary.
*
* @param textEntity The {@link TextEntity} containing author names to be added.
* @param level The level of the dictionary where the recommendations should be added.
*/
public void addMultipleAuthorsAsRecommendation(TextEntity textEntity, Level level) {
splitIntoAuthorNames(textEntity).forEach(authorName -> addLocalDictionaryEntry(textEntity.type(), authorName, textEntity.getMatchedRuleList(), true, level));
}
/**
* Adds multiple author names contained within a text entity as recommendations in the dictionary at the default level.
*
* @param textEntity The {@link TextEntity} containing author names to be added.
*/
public void addMultipleAuthorsAsRecommendation(TextEntity textEntity) {
addMultipleAuthorsAsRecommendation(textEntity, getDefaultLevel(textEntity.type()));
splitIntoAuthorNames(textEntity).forEach(authorName -> addLocalDictionaryEntry(textEntity.type(), authorName, textEntity.getMatchedRuleList(), true));
}

View File

@@ -1,90 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.dictionary;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import org.springframework.stereotype.Service;
import com.iqser.red.service.dictionarymerge.commons.DictionaryEntry;
import com.iqser.red.service.dictionarymerge.commons.DictionaryEntryModel;
import com.iqser.red.service.redaction.v1.server.model.document.entity.EntityType;
import lombok.RequiredArgsConstructor;
import lombok.SneakyThrows;
@Service
@RequiredArgsConstructor
public class DictionaryFactory {
@SneakyThrows
public Dictionary create(List<DictionaryModel> dictionaryModels, DictionaryVersion dictionaryVersion) {
Map<String, List<DictionaryIdentifierWithKeyword>> keyWordToIdentifiersMap = computeStringIdentifiersMap(dictionaryModels);
DictionarySearch dictionarySearch = getDictionarySearch(keyWordToIdentifiersMap);
return new Dictionary(dictionaryModels, dictionaryVersion, dictionarySearch);
}
private static DictionarySearch getDictionarySearch(Map<String, List<DictionaryIdentifierWithKeyword>> keyWordToIdentifiersMap) {
// a more sophisticated selection of the dictionarySearch could be done here
// but as we do not need to fine-tune at the moment, we use the all-rounder solution, the AhoCorasickMapDictionarySearch,
// based on this repository: https://github.com/RokLenarcic/AhoCorasick
// This is an outline of how a more complex dictionarySearch decision could be made:
// if (!redactionServiceSettings.isPriorityMode() && keyWordToIdentifiersMap.keySet().size() < 50_000) {
// dictionarySearch = new DoubleArrayTrieDictionarySearch(keyWordToIdentifiersMap);
// } else {
// dictionarySearch = new AhoCorasickMapDictionarySearch(keyWordToIdentifiersMap);
// }
return new AhoCorasickMapDictionarySearch(keyWordToIdentifiersMap);
}
protected static Map<String, List<DictionaryIdentifierWithKeyword>> computeStringIdentifiersMap(List<DictionaryModel> dictionaryModels) {
Map<String, List<DictionaryIdentifierWithKeyword>> stringToIdentifiersMap = new HashMap<>();
for (DictionaryModel model : dictionaryModels) {
// Add entries for different entity types
addEntriesToMap(stringToIdentifiersMap, model, model.isHint() ? EntityType.HINT : EntityType.ENTITY, model.getEntries(), false);
addEntriesToMap(stringToIdentifiersMap, model, EntityType.FALSE_POSITIVE, model.getFalsePositives(), false);
addEntriesToMap(stringToIdentifiersMap, model, EntityType.FALSE_RECOMMENDATION, model.getFalseRecommendations(), false);
if (model.isDossierDictionary()) {
addEntriesToMap(stringToIdentifiersMap, model, EntityType.DICTIONARY_REMOVAL, model.getEntries(), true);
}
}
return stringToIdentifiersMap;
}
private static void addEntriesToMap(Map<String, List<DictionaryIdentifierWithKeyword>> stringToIdentifiersMap,
DictionaryModel model,
EntityType entityType,
Set<DictionaryEntryModel> entries,
boolean isDeleted) {
DictionaryIdentifier identifier = new DictionaryIdentifier(model.getType(), entityType, model.isDossierDictionary(), !model.isCaseInsensitive());
List<String> values = entries.stream()
.filter(entry -> entry.isDeleted() == isDeleted)
.map(DictionaryEntry::getValue)
.toList();
for (String value : values) {
DictionaryIdentifierWithKeyword idWithKeyword = new DictionaryIdentifierWithKeyword(identifier, value);
String key = value.toLowerCase(Locale.ROOT);
stringToIdentifiersMap.computeIfAbsent(key, k -> new ArrayList<>()).add(idWithKeyword);
}
}
}
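The factory flattens every dictionary into one keyword-to-identifiers multimap keyed by the lower-cased keyword, so a single scan can serve case-sensitive and case-insensitive dictionaries alike; the original casing travels with the payload for a later exact check. A minimal sketch of that grouping, using illustrative stand-in records rather than the service's real types:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class KeywordIndexSketch {

    // Illustrative stand-ins for DictionaryIdentifier and DictionaryIdentifierWithKeyword.
    record Identifier(String type, boolean caseSensitive) {}
    record IdentifierWithKeyword(Identifier identifier, String keyword) {}

    // Index every keyword under its lower-cased form; the original casing
    // survives in the payload for a later exact-match check.
    static Map<String, List<IdentifierWithKeyword>> index(Map<Identifier, List<String>> dictionaries) {
        Map<String, List<IdentifierWithKeyword>> map = new HashMap<>();
        dictionaries.forEach((id, values) -> {
            for (String value : values) {
                map.computeIfAbsent(value.toLowerCase(Locale.ROOT), k -> new ArrayList<>())
                   .add(new IdentifierWithKeyword(id, value));
            }
        });
        return map;
    }

    public static void main(String[] args) {
        var byKey = index(Map.of(new Identifier("PERSON", true), List.of("Smith", "SMITH")));
        System.out.println(byKey.get("smith").size()); // prints 2: both casings share one key
    }
}
```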


@ -1,8 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.dictionary;
import com.iqser.red.service.redaction.v1.server.model.document.entity.EntityType;
public record DictionaryIdentifier(String type, EntityType entityType, boolean dossierDictionaryEntry, boolean caseSensitive) {
}


@ -1,51 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.dictionary;
import org.ahocorasick.trie.PayloadEmit;
import org.ahocorasick.trie.PayloadTrie;
import java.util.Collection;
public final class DictionaryIdentifierTrie {
private final PayloadTrie<DictionaryIdentifier> trie;
private DictionaryIdentifierTrie(PayloadTrie<DictionaryIdentifier> trie) {
this.trie = trie;
}
public static class DictionaryIdentifierTrieBuilder {
private final PayloadTrie.PayloadTrieBuilder<DictionaryIdentifier> builder;
public DictionaryIdentifierTrieBuilder() {
this.builder = PayloadTrie.builder();
}
public DictionaryIdentifierTrieBuilder ignoreCase() {
builder.ignoreCase();
return this;
}
public DictionaryIdentifierTrieBuilder addKeyword(String keyword, DictionaryIdentifier payload) {
builder.addKeyword(keyword, payload);
return this;
}
public DictionaryIdentifierTrieBuilder addKeywords(Collection<String> keywords, DictionaryIdentifier payload) {
for (String keyword : keywords) {
builder.addKeyword(keyword, payload);
}
return this;
}
public DictionaryIdentifierTrie build() {
return new DictionaryIdentifierTrie(builder.build());
}
}
public Collection<PayloadEmit<DictionaryIdentifier>> parseText(CharSequence text) {
return trie.parseText(text);
}
public boolean containsMatch(CharSequence text) {
return trie.containsMatch(text);
}
}


@ -1,5 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.dictionary;
public record DictionaryIdentifierWithKeyword(DictionaryIdentifier identifier, String keyword) {
}


@ -1,12 +1,13 @@
package com.iqser.red.service.redaction.v1.server.model.dictionary;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;
import com.iqser.red.service.dictionarymerge.commons.DictionaryEntry;
import com.iqser.red.service.dictionarymerge.commons.DictionaryEntryModel;
import com.iqser.red.service.persistence.service.v1.api.shared.model.dossiertemplate.type.Type;
import com.iqser.red.service.redaction.v1.server.model.document.entity.MatchedRule;
import lombok.Data;
@ -20,7 +21,7 @@ import lombok.extern.slf4j.Slf4j;
*/
@Data
@Slf4j
public class DictionaryModel implements Cloneable {
public class DictionaryModel implements Serializable {
private final String type;
private final int rank;
@ -32,8 +33,13 @@ public class DictionaryModel implements Cloneable {
private final Set<DictionaryEntryModel> falsePositives;
private final Set<DictionaryEntryModel> falseRecommendations;
private transient SearchImplementation entriesSearch;
private transient SearchImplementation deletionEntriesSearch;
private transient SearchImplementation falsePositiveSearch;
private transient SearchImplementation falseRecommendationsSearch;
private final HashMap<String, Set<MatchedRule>> localEntriesWithMatchedRules = new HashMap<>();
private SearchImplementation localSearch;
private transient SearchImplementation localSearch;
/**
@ -85,6 +91,74 @@ public class DictionaryModel implements Cloneable {
}
/**
* Returns the search implementation for non-deleted dictionary entries.
*
* @return The {@link SearchImplementation} for non-deleted dictionary entries.
*/
public SearchImplementation getEntriesSearch() {
if (entriesSearch == null) {
this.entriesSearch = new SearchImplementation(this.entries.stream()
.filter(e -> !e.isDeleted())
.map(DictionaryEntry::getValue)
.collect(Collectors.toList()), caseInsensitive);
}
return entriesSearch;
}
/**
* Returns the search implementation for deleted dictionary entries.
*
* @return The {@link SearchImplementation} for deleted dictionary entries.
*/
public SearchImplementation getDeletionEntriesSearch() {
if (deletionEntriesSearch == null) {
this.deletionEntriesSearch = new SearchImplementation(this.entries.stream()
.filter(DictionaryEntry::isDeleted)
.map(DictionaryEntry::getValue)
.collect(Collectors.toList()), caseInsensitive);
}
return deletionEntriesSearch;
}
/**
* Returns the search implementation for non-deleted false positive entries.
*
* @return The {@link SearchImplementation} for non-deleted false positive entries.
*/
public SearchImplementation getFalsePositiveSearch() {
if (falsePositiveSearch == null) {
this.falsePositiveSearch = new SearchImplementation(this.falsePositives.stream()
.filter(e -> !e.isDeleted())
.map(DictionaryEntry::getValue)
.collect(Collectors.toList()), caseInsensitive);
}
return falsePositiveSearch;
}
/**
* Returns the search implementation for non-deleted false recommendation entries.
*
* @return The {@link SearchImplementation} for non-deleted false recommendation entries.
*/
public SearchImplementation getFalseRecommendationsSearch() {
if (falseRecommendationsSearch == null) {
this.falseRecommendationsSearch = new SearchImplementation(this.falseRecommendations.stream()
.filter(e -> !e.isDeleted())
.map(DictionaryEntry::getValue)
.collect(Collectors.toList()), caseInsensitive);
}
return falseRecommendationsSearch;
}
/**
* Retrieves the matched rules for a given value from the local dictionary entries.
* The value is processed based on the case sensitivity of the dictionary.
@ -98,149 +172,4 @@ public class DictionaryModel implements Cloneable {
return localEntriesWithMatchedRules.get(cleanedValue);
}
@Override
public DictionaryModel clone() {
try {
DictionaryModel cloned = (DictionaryModel) super.clone();
cloned.localSearch = null;
return cloned;
} catch (CloneNotSupportedException e) {
throw new AssertionError("Cloning not supported", e);
}
}
public void addNewEntries(long versionThreshold, Set<DictionaryIncrementValue> newValues) {
getEntries().forEach(entry -> {
if (entry.getVersion() > versionThreshold) {
newValues.add(new DictionaryIncrementValue(entry.getValue(), isCaseInsensitive()));
}
});
getFalsePositives().forEach(entry -> {
if (entry.getVersion() > versionThreshold) {
newValues.add(new DictionaryIncrementValue(entry.getValue(), isCaseInsensitive()));
}
});
getFalseRecommendations().forEach(entry -> {
if (entry.getVersion() > versionThreshold) {
newValues.add(new DictionaryIncrementValue(entry.getValue(), isCaseInsensitive()));
}
});
}
public void handleOldEntries(Type newType,
DictionaryEntries newEntries,
Set<DictionaryEntryModel> combinedEntries,
Set<DictionaryEntryModel> combinedFalsePositives,
Set<DictionaryEntryModel> combinedFalseRecommendations) {
if (isCaseInsensitive() && !newType.isCaseInsensitive()) {
// Compute new entries' values in lowercase once
Set<String> newEntryValuesLower = newEntries.getEntries()
.stream()
.map(s -> s.getValue().toLowerCase(Locale.ROOT))
.collect(Collectors.toSet());
combinedEntries.addAll(getEntries()
.stream()
.filter(f -> !newEntryValuesLower.contains(f.getValue()))
.collect(Collectors.toSet()));
// Similarly for false positives
Set<String> newFalsePositivesValuesLower = newEntries.getFalsePositives()
.stream()
.map(s -> s.getValue().toLowerCase(Locale.ROOT))
.collect(Collectors.toSet());
combinedFalsePositives.addAll(getFalsePositives()
.stream()
.filter(f -> !newFalsePositivesValuesLower.contains(f.getValue()))
.collect(Collectors.toSet()));
// Similarly for false recommendations
Set<String> newFalseRecommendationsValuesLower = newEntries.getFalseRecommendations()
.stream()
.map(s -> s.getValue().toLowerCase(Locale.ROOT))
.collect(Collectors.toSet());
combinedFalseRecommendations.addAll(getFalseRecommendations()
.stream()
.filter(f -> !newFalseRecommendationsValuesLower.contains(f.getValue()))
.collect(Collectors.toSet()));
} else if (!isCaseInsensitive() && newType.isCaseInsensitive()) {
// Compute new entries' values in lowercase once
Set<String> newEntryValuesLower = newEntries.getEntries()
.stream()
.map(s -> s.getValue().toLowerCase(Locale.ROOT))
.collect(Collectors.toSet());
combinedEntries.addAll(getEntries()
.stream()
.filter(f -> !newEntryValuesLower.contains(f.getValue().toLowerCase(Locale.ROOT)))
.collect(Collectors.toSet()));
// Similarly for false positives
Set<String> newFalsePositivesValuesLower = newEntries.getFalsePositives()
.stream()
.map(s -> s.getValue().toLowerCase(Locale.ROOT))
.collect(Collectors.toSet());
combinedFalsePositives.addAll(getFalsePositives()
.stream()
.filter(f -> !newFalsePositivesValuesLower.contains(f.getValue().toLowerCase(Locale.ROOT)))
.collect(Collectors.toSet()));
// Similarly for false recommendations
Set<String> newFalseRecommendationsValuesLower = newEntries.getFalseRecommendations()
.stream()
.map(s -> s.getValue().toLowerCase(Locale.ROOT))
.collect(Collectors.toSet());
combinedFalseRecommendations.addAll(getFalseRecommendations()
.stream()
.filter(f -> !newFalseRecommendationsValuesLower.contains(f.getValue().toLowerCase(Locale.ROOT)))
.collect(Collectors.toSet()));
} else {
// Both have the same case sensitivity
Set<String> newEntryValues = newEntries.getEntries()
.stream()
.map(DictionaryEntryModel::getValue)
.collect(Collectors.toSet());
combinedEntries.addAll(getEntries()
.stream()
.filter(f -> !newEntryValues.contains(f.getValue()))
.collect(Collectors.toSet()));
// Similarly for false positives
Set<String> newFalsePositivesValues = newEntries.getFalsePositives()
.stream()
.map(DictionaryEntryModel::getValue)
.collect(Collectors.toSet());
combinedFalsePositives.addAll(getFalsePositives()
.stream()
.filter(f -> !newFalsePositivesValues.contains(f.getValue()))
.collect(Collectors.toSet()));
// Similarly for false recommendations
Set<String> newFalseRecommendationsValues = newEntries.getFalseRecommendations()
.stream()
.map(DictionaryEntryModel::getValue)
.collect(Collectors.toSet());
combinedFalseRecommendations.addAll(getFalseRecommendations()
.stream()
.filter(f -> !newFalseRecommendationsValues.contains(f.getValue()))
.collect(Collectors.toSet()));
}
}
}
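handleOldEntries above keeps an old entry only when the new dictionary does not already cover it, and the comparison key depends on the case-sensitivity transition. The sketch below reduces the three branches to one filter; the helper name is hypothetical, and it assumes (as the production code effectively does) that comparing in lower case is safe whenever either side ignores case:

```java
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

public class EntryMergeSketch {

    // Old values that survive a merge: an old value is dropped when the
    // new dictionary already covers it under the comparison key.
    static Set<String> survivingOldValues(Set<String> oldValues, Set<String> newValues,
                                          boolean oldInsensitive, boolean newInsensitive) {
        // Compare in lower case as soon as either side ignores case.
        UnaryOperator<String> key = (oldInsensitive || newInsensitive)
                ? s -> s.toLowerCase(Locale.ROOT)
                : UnaryOperator.identity();
        Set<String> newKeys = newValues.stream().map(key).collect(Collectors.toSet());
        return oldValues.stream()
                .filter(v -> !newKeys.contains(key.apply(v)))
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // "Foo" is already covered case-insensitively by "foo", so only "Bar" survives.
        System.out.println(survivingOldValues(Set.of("Foo", "Bar"), Set.of("foo"), true, false));
    }
}
```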


@ -1,86 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.dictionary;
import java.util.List;
import java.util.stream.Stream;
import com.iqser.red.service.redaction.v1.server.model.document.TextRange;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlock;
/**
* Common interface for dictionary search implementations.
*/
public interface DictionarySearch {
/**
* Retrieves a list of match boundaries within the given text.
*
* @param text The text to search within.
* @return A list of MatchTextRange representing the boundaries of matches.
*/
default List<MatchTextRange> getBoundariesAsList(CharSequence text) {
return getBoundaries(text).toList();
}
/**
* Retrieves a stream of match boundaries within the given text.
*
* @param text The text to search within.
* @return A stream of MatchTextRange representing the boundaries of matches.
*/
Stream<MatchTextRange> getBoundaries(CharSequence text);
/**
* Retrieves a list of match boundaries within a specified region of the text.
*
* @param text The text to search within.
* @param region The specific region of the text to search.
* @return A list of MatchTextRange representing the boundaries of matches.
*/
default List<MatchTextRange> getBoundariesAsList(CharSequence text, TextRange region) {
return getBoundaries(text, region).toList();
}
/**
* Retrieves a stream of match boundaries within a specified region of the text.
*
* @param text The text to search within.
* @param region The specific region of the text to search.
* @return A stream of MatchTextRange representing the boundaries of matches.
*/
Stream<MatchTextRange> getBoundaries(CharSequence text, TextRange region);
/**
* Retrieves a stream of match boundaries within the given TextBlock.
*
* @param textBlock The TextBlock to search within.
* @return A stream of MatchTextRange representing the boundaries of matches.
*/
Stream<MatchTextRange> getBoundaries(TextBlock textBlock);
/**
* Retrieves a list of match positions within the given text.
*
* @param text The text to search within.
* @return A list of MatchPosition representing the positions of matches.
*/
default List<MatchPosition> getMatchesAsList(String text) {
return getMatches(text).toList();
}
/**
* Retrieves a stream of match positions within the given text.
*
* @param text The text to search within.
* @return A stream of MatchPosition representing the positions of matches.
*/
Stream<MatchPosition> getMatches(String text);
/**
* Record representing the range of matched text along with its identifier.
*/
record MatchTextRange(DictionaryIdentifier identifier, TextRange textRange) {}
/**
* Record representing the start and end positions of a match along with its identifier.
*/
record MatchPosition(DictionaryIdentifier identifier, int startIndex, int endIndex) {}
}
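DictionarySearch leaves the matching strategy open; the implementations in this package use Aho-Corasick tries, but a naive indexOf scan is enough to illustrate the getMatches contract. A self-contained sketch with a simplified match record (not the interface's nested types):

```java
import java.util.ArrayList;
import java.util.List;

public class NaiveSearchSketch {

    // Simplified stand-in for DictionarySearch.MatchPosition.
    record MatchPosition(String keyword, int startIndex, int endIndex) {}

    // Naive scan: repeated indexOf per keyword. The production code uses an
    // Aho-Corasick trie so the text is traversed only once for all keywords.
    static List<MatchPosition> getMatches(String text, List<String> keywords) {
        List<MatchPosition> matches = new ArrayList<>();
        for (String keyword : keywords) {
            int from = 0;
            int at;
            while ((at = text.indexOf(keyword, from)) >= 0) {
                matches.add(new MatchPosition(keyword, at, at + keyword.length()));
                from = at + 1; // advance by one so overlapping hits are reported too
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(getMatches("aaa", List.of("aa"))); // two overlapping hits
    }
}
```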


@ -1,31 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.dictionary;
import java.util.List;
import java.util.Map;
import com.hankcs.algorithm.AhoCorasickDoubleArrayTrie;
public class DoubleArrayTrieDictionarySearch extends AbstractDictionarySearch {
private final AhoCorasickDoubleArrayTrie<List<DictionaryIdentifierWithKeyword>> trie;
public DoubleArrayTrieDictionarySearch(Map<String, List<DictionaryIdentifierWithKeyword>> keyWordToIdentifiersMap) {
super(keyWordToIdentifiersMap);
trie = new AhoCorasickDoubleArrayTrie<>();
trie.build(keyWordToIdentifiersMap);
}
@Override
protected void parseText(CharSequence text, HitHandler handler) {
List<AhoCorasickDoubleArrayTrie.Hit<List<DictionaryIdentifierWithKeyword>>> hits = trie.parseText(text);
for (AhoCorasickDoubleArrayTrie.Hit<List<DictionaryIdentifierWithKeyword>> hit : hits) {
handler.handle(hit.begin, hit.end, hit.value);
}
}
}


@ -1,138 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.dictionary;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;
import com.iqser.red.service.redaction.v1.server.model.document.TextRange;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlock;
public class DoubleTrieDictionarySearch implements DictionarySearch {
private final Map<DictionaryIdentifier, List<String>> caseSensitiveEntries = new HashMap<>();
private final Map<DictionaryIdentifier, List<String>> caseInsensitiveEntries = new HashMap<>();
private final DictionaryIdentifierTrie caseSensitiveTrie;
private final DictionaryIdentifierTrie caseInsensitiveTrie;
public DoubleTrieDictionarySearch(Map<DictionaryIdentifier, List<String>> dictionaryValues) {
for (Map.Entry<DictionaryIdentifier, List<String>> entry : dictionaryValues.entrySet()) {
DictionaryIdentifier identifier = entry.getKey();
List<String> values = entry.getValue();
if (identifier.caseSensitive()) {
caseSensitiveEntries.put(identifier, values);
} else {
caseInsensitiveEntries.put(identifier, values);
}
}
this.caseSensitiveTrie = createTrie(caseSensitiveEntries, false);
this.caseInsensitiveTrie = createTrie(caseInsensitiveEntries, true);
}
private DictionaryIdentifierTrie createTrie(Map<DictionaryIdentifier, List<String>> entries, boolean ignoreCase) {
if (entries.isEmpty()) {
return null;
}
DictionaryIdentifierTrie.DictionaryIdentifierTrieBuilder builder = new DictionaryIdentifierTrie.DictionaryIdentifierTrieBuilder();
if (ignoreCase) {
builder.ignoreCase();
}
entries.forEach((identifier, values) -> {
for (String value : values) {
builder.addKeyword(value, identifier);
}
});
return builder.build();
}
public boolean atLeastOneMatches(String text) {
if (!caseSensitiveEntries.isEmpty() && caseSensitiveTrie != null && caseSensitiveTrie.containsMatch(text)) {
return true;
}
return !caseInsensitiveEntries.isEmpty() && caseInsensitiveTrie != null && caseInsensitiveTrie.containsMatch(text);
}
@Override
public Stream<MatchTextRange> getBoundaries(CharSequence text) {
List<MatchTextRange> matches = new ArrayList<>();
addMatchTextRangesForTrie(caseSensitiveEntries, caseSensitiveTrie, matches, text);
addMatchTextRangesForTrie(caseInsensitiveEntries, caseInsensitiveTrie, matches, text);
return matches.stream();
}
@Override
public Stream<MatchTextRange> getBoundaries(TextBlock textBlock) {
return getBoundaries(textBlock, textBlock.getTextRange());
}
@Override
public Stream<MatchTextRange> getBoundaries(CharSequence text, TextRange region) {
List<MatchTextRange> matches = new ArrayList<>();
addMatchTextRangesForTrie(text, region, matches, caseSensitiveEntries, caseSensitiveTrie);
addMatchTextRangesForTrie(text, region, matches, caseInsensitiveEntries, caseInsensitiveTrie);
return matches.stream();
}
@Override
public Stream<MatchPosition> getMatches(String text) {
List<MatchPosition> matches = new ArrayList<>();
addMatchPositionsForTrie(caseSensitiveEntries, caseSensitiveTrie, matches, text);
addMatchPositionsForTrie(caseInsensitiveEntries, caseInsensitiveTrie, matches, text);
return matches.stream();
}
private void addMatchTextRangesForTrie(Map<DictionaryIdentifier, List<String>> entries, DictionaryIdentifierTrie trie, List<MatchTextRange> matches, CharSequence text) {
if (!entries.isEmpty() && trie != null) {
matches.addAll(trie.parseText(text)
.stream()
.map(r -> new MatchTextRange(r.getPayload(), new TextRange(r.getStart(), r.getEnd() + 1)))
.toList());
}
}
private void addMatchTextRangesForTrie(CharSequence text,
TextRange region,
List<MatchTextRange> matches,
Map<DictionaryIdentifier, List<String>> entries,
DictionaryIdentifierTrie trie) {
if (!entries.isEmpty() && trie != null) {
CharSequence subSequence = text.subSequence(region.start(), region.end());
matches.addAll(trie.parseText(subSequence)
.stream()
.map(r -> new MatchTextRange(r.getPayload(), new TextRange(r.getStart() + region.start(), r.getEnd() + region.start() + 1)))
.toList());
}
}
private void addMatchPositionsForTrie(Map<DictionaryIdentifier, List<String>> entries, DictionaryIdentifierTrie trie, List<MatchPosition> matches, String text) {
if (!entries.isEmpty() && trie != null) {
matches.addAll(trie.parseText(text)
.stream()
.map(r -> new MatchPosition(r.getPayload(), r.getStart(), r.getEnd() + 1))
.toList());
}
}
}
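In the region overload above, only text.subSequence(region.start(), region.end()) is scanned, so every hit must be shifted back by region.start(); the trie reports an inclusive end index, hence the + 1 when converting to the half-open TextRange. The offset arithmetic in isolation (illustrative record, not the real TextRange):

```java
public class RegionOffsetSketch {

    // Half-open [start, end) range, mirroring TextRange's convention.
    record Range(int start, int end) {}

    // Convert a hit found inside a sub-sequence back to coordinates of the
    // full text: shift by the region start and turn the trie's inclusive end
    // into a half-open end (+ 1).
    static Range rebase(int hitStart, int hitEndInclusive, int regionStart) {
        return new Range(hitStart + regionStart, hitEndInclusive + regionStart + 1);
    }

    public static void main(String[] args) {
        // A hit at [2, 4] (inclusive) inside a region starting at offset 10
        // covers absolute offsets [12, 15).
        System.out.println(rebase(2, 4, 10)); // Range[start=12, end=15]
    }
}
```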


@ -11,7 +11,6 @@ import java.util.stream.Stream;
import org.ahocorasick.trie.Trie;
import com.iqser.red.service.redaction.v1.server.model.document.TextRange;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlock;
import lombok.Data;
@ -105,12 +104,6 @@ public class SearchImplementation {
}
public Stream<TextRange> getBoundaries(TextBlock textBlock) {
return getBoundaries(textBlock, textBlock.getTextRange());
}
public Stream<TextRange> getBoundaries(CharSequence text, TextRange region) {
if (this.values.isEmpty()) {


@ -1,46 +0,0 @@
package com.iqser.red.service.redaction.v1.server.model.dictionary;
import java.util.Locale;
import lombok.Getter;
public class TextContext {
private final CharSequence text;
@Getter
private final String lowerText;
private final int offset;
TextContext(CharSequence text, int offset) {
this.text = text;
this.lowerText = text.toString().toLowerCase(Locale.ROOT);
this.offset = offset;
}
TextContext(CharSequence text) {
this(text, 0);
}
public int getStart(int hitBegin) {
return hitBegin + offset;
}
public int getEnd(int hitEnd) {
return hitEnd + offset;
}
public String getMatchedText(int hitBegin, int hitEnd) {
return text.subSequence(hitBegin, hitEnd).toString();
}
}
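TextContext lower-cases the text once at construction so repeated case-insensitive scans can run over lowerText, while getMatchedText cuts the span from the original sequence to preserve casing. The same idea in a tiny self-contained form (hypothetical helper, single keyword):

```java
import java.util.Locale;

public class TextContextSketch {

    // Lower-case the text once so a case-insensitive search can run on it,
    // then cut the matched span from the ORIGINAL text to keep its casing.
    static String matchedText(String text, String lowerKeyword) {
        String lower = text.toLowerCase(Locale.ROOT);
        int at = lower.indexOf(lowerKeyword);
        return at < 0 ? null : text.substring(at, at + lowerKeyword.length());
    }

    public static void main(String[] args) {
        System.out.println(matchedText("Dr. SMITH et al.", "smith")); // prints SMITH
    }
}
```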


@ -0,0 +1,29 @@
package com.iqser.red.service.redaction.v1.server.model.document;
import java.io.Serializable;
import com.knecon.fforesight.service.layoutparser.internal.api.data.redaction.DocumentPage;
import com.knecon.fforesight.service.layoutparser.internal.api.data.redaction.DocumentPositionData;
import com.knecon.fforesight.service.layoutparser.internal.api.data.redaction.DocumentStructure;
import com.knecon.fforesight.service.layoutparser.internal.api.data.redaction.DocumentTextData;
import lombok.AccessLevel;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.experimental.FieldDefaults;
@Data
@Builder
@AllArgsConstructor
@NoArgsConstructor
@FieldDefaults(level = AccessLevel.PRIVATE)
public class DocumentData implements Serializable {
DocumentPage[] documentPages;
DocumentTextData[] documentTextData;
DocumentPositionData[] documentPositionData;
DocumentStructure documentStructure;
}


@ -9,8 +9,6 @@ import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;
import com.iqser.red.service.redaction.v1.server.model.document.entity.EntityType;
import com.iqser.red.service.redaction.v1.server.model.document.entity.TextEntity;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Document;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.GenericSemanticNode;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.NodeType;
@ -19,8 +17,6 @@ import com.iqser.red.service.redaction.v1.server.model.document.nodes.Table;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.TableCell;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlock;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlockCollector;
import com.iqser.red.service.redaction.v1.server.utils.EntityCreationUtility;
import com.iqser.red.service.redaction.v1.server.utils.EntityEnrichmentService;
import lombok.AccessLevel;
import lombok.AllArgsConstructor;
@ -39,7 +35,7 @@ public class DocumentTree {
public DocumentTree(Document document) {
this.root = Entry.builder().treeId(Collections.emptyList()).children(new LinkedList<>()).node(document).build();
root = Entry.builder().treeId(Collections.emptyList()).children(new LinkedList<>()).node(document).build();
}
@ -300,22 +296,6 @@ public class DocumentTree {
}
public Optional<Entry> findEntryById(List<Integer> treeId) {
if (treeId.isEmpty()) {
return Optional.of(root);
}
Entry entry = root;
for (int id : treeId) {
if (id < 0 || id >= entry.children.size()) {
return Optional.empty();
}
entry = entry.children.get(id);
}
return Optional.of(entry);
}
public Stream<Entry> mainEntries() {
return root.children.stream();
@ -362,25 +342,6 @@ public class DocumentTree {
}
public void addEntityToGraph(TextEntity entity) {
getRoot().getNode().addThisToEntityIfIntersects(entity);
TextBlock textBlock = entity.getDeepestFullyContainingNode().getTextBlock();
EntityEnrichmentService.enrichEntity(entity, textBlock);
EntityCreationUtility.addToPages(entity);
EntityCreationUtility.addEntityToNodeEntitySets(entity);
if (entity.getEntityType().equals(EntityType.TEMPORARY)) {
return;
}
entity.computeRelations();
entity.notifyEntityInserted();
}
@Builder
@Getter
@AllArgsConstructor


@ -134,12 +134,6 @@ public class TextRange implements Comparable<TextRange> {
}
public boolean containsExclusive(int index) {
return start <= index && index < end;
}
/**
* Checks if this {@link TextRange} intersects with another {@link TextRange}.
*


@ -6,6 +6,7 @@ import java.util.PriorityQueue;
import java.util.Set;
import com.iqser.red.service.redaction.v1.server.model.document.TextRange;
import com.iqser.red.service.redaction.v1.server.model.drools.RuleIdentifier;
import lombok.NonNull;
@ -51,17 +52,6 @@ public interface IEntity {
String type();
/**
* An entity is valid when it is active and not a false recommendation, a false positive or a dictionary removal.
*
* @return true if the entity is valid, false otherwise.
*/
default boolean valid() {
return active();
}
/**
* Calculates the length of the entity's value.
*
@ -95,9 +85,6 @@ public interface IEntity {
// Don't use default accessor pattern (e.g. isApplied()), as it might lead to errors in drools due to property-specific optimization of the drools planner.
default boolean applied() {
if (this.getMatchedRule().isHigherPriorityThanManual()) {
return getMatchedRule().isApplied();
}
return getManualOverwrite().getApplied()
.orElse(getMatchedRule().isApplied());
}
@ -121,10 +108,6 @@ public interface IEntity {
*/
default boolean ignored() {
if (this.getMatchedRule().isHigherPriorityThanManual()) {
return getMatchedRule().isIgnored();
}
return getManualOverwrite().getIgnored()
.orElse(getMatchedRule().isIgnored());
}
@ -137,9 +120,6 @@ public interface IEntity {
*/
default boolean removed() {
if (this.getMatchedRule().isHigherPriorityThanManual()) {
return getMatchedRule().isRemoved();
}
return getManualOverwrite().getRemoved()
.orElse(getMatchedRule().isRemoved());
}
@ -152,9 +132,6 @@ public interface IEntity {
*/
default boolean resized() {
if (this.getMatchedRule().isHigherPriorityThanManual()) {
return getMatchedRule().isRemoved();
}
return getManualOverwrite().getResized()
.orElse(false);
}
@ -339,9 +316,7 @@ public interface IEntity {
*/
default void addMatchedRule(MatchedRule matchedRule) {
boolean wasValid = valid();
getMatchedRuleList().add(matchedRule);
handleStateChange(wasValid);
}
@ -355,53 +330,7 @@ public interface IEntity {
if (getMatchedRuleList().equals(matchedRules)) {
return;
}
boolean wasValid = valid();
getMatchedRuleList().addAll(matchedRules);
handleStateChange(wasValid);
}
void addEntityEventListener(EntityEventListener listener);
void removeEntityEventListener(EntityEventListener listener);
default void notifyEntityInserted() {
for (EntityEventListener listener : getEntityEventListeners()) {
listener.onEntityInserted(this);
}
}
default void notifyEntityUpdated() {
for (EntityEventListener listener : getEntityEventListeners()) {
listener.onEntityUpdated(this);
}
}
default void notifyEntityRemoved() {
for (EntityEventListener listener : getEntityEventListeners()) {
listener.onEntityRemoved(this);
}
}
Collection<EntityEventListener> getEntityEventListeners();
default void handleStateChange(boolean wasValid) {
if (valid() == wasValid) {
return;
}
if (!removed()) {
notifyEntityUpdated();
} else {
notifyEntityRemoved();
}
}
@ -435,9 +364,15 @@ public interface IEntity {
*
* @return The built reason string.
*/
default String buildReason() {
default String buildReasonWithManualChangeDescriptions() {
return getMatchedRule().getReason();
if (getManualOverwrite().getDescriptions().isEmpty()) {
return getMatchedRule().getReason();
}
if (getMatchedRule().getReason().isEmpty()) {
return String.join(", ", getManualOverwrite().getDescriptions());
}
return getMatchedRule().getReason() + ", " + String.join(", ", getManualOverwrite().getDescriptions());
}
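buildReasonWithManualChangeDescriptions joins the rule reason with the accumulated manual-change descriptions, guarding both empty sides so no stray comma appears. The joining logic on its own (plain strings, hypothetical class name):

```java
import java.util.List;

public class ReasonJoinSketch {

    // Join the rule reason with manual-change descriptions,
    // skipping whichever side is empty.
    static String buildReason(String ruleReason, List<String> descriptions) {
        if (descriptions.isEmpty()) {
            return ruleReason;
        }
        if (ruleReason.isEmpty()) {
            return String.join(", ", descriptions);
        }
        return ruleReason + ", " + String.join(", ", descriptions);
    }

    public static void main(String[] args) {
        // prints CBI.7.1, resized by manual override
        System.out.println(buildReason("CBI.7.1", List.of("resized by manual override")));
    }
}
```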


@ -1,8 +1,10 @@
package com.iqser.red.service.redaction.v1.server.model.document.entity;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.BaseAnnotation;
@ -12,6 +14,7 @@ import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRecategorization;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualRedactionEntry;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.ManualResizeRedaction;
import com.iqser.red.service.redaction.v1.server.model.RectangleWithPage;
import lombok.AccessLevel;
import lombok.AllArgsConstructor;
@ -23,9 +26,18 @@ import lombok.experimental.FieldDefaults;
@FieldDefaults(level = AccessLevel.PRIVATE)
public class ManualChangeOverwrite {
private static final Map<Class<? extends BaseAnnotation>, String> MANUAL_CHANGE_DESCRIPTIONS = Map.of(//
ManualRedactionEntry.class, "created by manual change", //
ManualLegalBasisChange.class, "legal basis was manually changed", //
ManualResizeRedaction.class, "resized by manual override", //
ManualForceRedaction.class, "forced by manual override", //
IdRemoval.class, "removed by manual override", //
ManualRecategorization.class, "recategorized by manual override");
@Builder.Default
List<BaseAnnotation> manualChanges = new LinkedList<>();
boolean changed;
List<String> descriptions;
String type;
String legalBasis;
String section;
@ -51,7 +63,6 @@ public class ManualChangeOverwrite {
this.manualChanges = new LinkedList<>();
}
public ManualChangeOverwrite(EntityType entityType, String section) {
this(entityType);
@ -84,6 +95,8 @@ public class ManualChangeOverwrite {
private void updateFields(List<BaseAnnotation> sortedManualChanges) {
descriptions = new LinkedList<>();
for (BaseAnnotation manualChange : sortedManualChanges) {
// ManualRedactionEntries are created prior to rule execution in the analysis service.
@ -138,6 +151,8 @@ public class ManualChangeOverwrite {
legalBasis = recategorization.getLegalBasis();
}
}
descriptions.add(MANUAL_CHANGE_DESCRIPTIONS.get(manualChange.getClass()));
}
changed = false;
}
@ -230,6 +245,13 @@ public class ManualChangeOverwrite {
}
public List<String> getDescriptions() {
calculateCurrentOverride();
return descriptions == null ? Collections.emptyList() : descriptions;
}
public Optional<List<RectangleWithPage>> getPositions() {
calculateCurrentOverride();


@ -5,6 +5,9 @@ import java.util.List;
import java.util.Objects;
import java.util.Set;
import com.iqser.red.service.redaction.v1.server.model.drools.RuleIdentifier;
import com.iqser.red.service.redaction.v1.server.model.drools.RuleType;
import lombok.AccessLevel;
import lombok.AllArgsConstructor;
import lombok.Builder;
@@ -25,9 +28,8 @@ public final class MatchedRule implements Comparable<MatchedRule> {
public static final RuleType FINAL_TYPE = RuleType.fromString("FINAL");
public static final RuleType ELIMINATION_RULE_TYPE = RuleType.fromString("X");
public static final RuleType IMPORTED_TYPE = RuleType.fromString("IMP");
public static final RuleType MANUAL_TYPE = RuleType.fromString("MAN");
public static final RuleType DICTIONARY_TYPE = RuleType.fromString("DICT");
private static final List<RuleType> RULE_TYPE_PRIORITIES = List.of(FINAL_TYPE, ELIMINATION_RULE_TYPE, MANUAL_TYPE, IMPORTED_TYPE, DICTIONARY_TYPE);
private static final List<RuleType> RULE_TYPE_PRIORITIES = List.of(FINAL_TYPE, ELIMINATION_RULE_TYPE, IMPORTED_TYPE, DICTIONARY_TYPE);
RuleIdentifier ruleIdentifier;
@Builder.Default
@@ -55,13 +57,6 @@ public final class MatchedRule implements Comparable<MatchedRule> {
}
public boolean isHigherPriorityThanManual() {
return (-1 < RULE_TYPE_PRIORITIES.indexOf(this.ruleIdentifier.type())) && (RULE_TYPE_PRIORITIES.indexOf(this.ruleIdentifier.type()) < RULE_TYPE_PRIORITIES.indexOf(
MANUAL_TYPE));
}
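The removed `isHigherPriorityThanManual` above encoded rule priority as position in `RULE_TYPE_PRIORITIES`, compared via `List.indexOf`. A minimal sketch of that pattern with plain strings standing in for `RuleType` (class and method names here are hypothetical):

```java
import java.util.List;

public class PriorityByIndex {

    // Earlier in the list means higher priority, mirroring RULE_TYPE_PRIORITIES.
    static final List<String> PRIORITIES = List.of("FINAL", "X", "IMP", "DICT");

    // indexOf returns -1 for unknown types; map those to lowest priority
    // instead of accidentally ranking them highest.
    static int rank(String type) {
        int idx = PRIORITIES.indexOf(type);
        return idx < 0 ? Integer.MAX_VALUE : idx;
    }

    public static boolean higherPriority(String a, String b) {
        return rank(a) < rank(b);
    }

    public static void main(String[] args) {
        System.out.println(higherPriority("FINAL", "DICT"));  // true
        System.out.println(higherPriority("UNKNOWN", "IMP")); // false
    }
}
```

Note the removed method had to guard the `-1` case explicitly (`-1 < indexOf(...)`); normalizing unknowns in one place, as sketched, avoids repeating that check at every call site.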
/**
* Returns a modified instance of {@link MatchedRule} based on its applied status.
* If the rule has been applied, it returns a new {@link MatchedRule} instance that retains all properties of the original


@@ -4,7 +4,6 @@ import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
@@ -12,10 +11,7 @@ import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;
import org.apache.commons.collections4.map.HashedMap;
import com.iqser.red.service.persistence.service.v1.api.shared.model.analysislog.entitylog.Engine;
import com.iqser.red.service.persistence.service.v1.api.shared.model.annotations.entitymapped.BaseAnnotation;
import com.iqser.red.service.redaction.v1.server.model.document.TextRange;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.Page;
import com.iqser.red.service.redaction.v1.server.model.document.nodes.SemanticNode;
@@ -28,10 +24,6 @@ import lombok.Data;
import lombok.EqualsAndHashCode;
import lombok.experimental.FieldDefaults;
/**
* Represents a text entity within a document, characterized by its text range, type, entity type,
* and associated metadata like matched rules, pages, and engines.
*/
@Data
@Builder
@AllArgsConstructor
@@ -47,14 +39,13 @@ public class TextEntity implements IEntity {
TextRange textRange;
@Builder.Default
Set<TextRange> duplicateTextRanges = new HashSet<>();
List<TextRange> duplicateTextRanges = new ArrayList<>();
String type; // TODO: make final once ManualChangesApplicationService::recategorize is deleted
final EntityType entityType;
@Builder.Default
final PriorityQueue<MatchedRule> matchedRuleList = new PriorityQueue<>();
@Builder.Default
final ManualChangeOverwrite manualOverwrite = new ManualChangeOverwrite();
final ManualChangeOverwrite manualOverwrite;
boolean dictionaryEntry;
boolean dossierDictionaryEntry;
@@ -73,12 +64,6 @@ public class TextEntity implements IEntity {
List<SemanticNode> intersectingNodes = new LinkedList<>();
SemanticNode deepestFullyContainingNode;
@Builder.Default
Map<TextEntity, Set<Relation>> relations = new HashMap<>();
@Builder.Default
Collection<EntityEventListener> entityEventListeners = new ArrayList<>();
public static TextEntity initialEntityNode(TextRange textRange, String type, EntityType entityType, SemanticNode node) {
@@ -169,15 +154,12 @@ public class TextEntity implements IEntity {
public void removeFromGraph() {
remove("FINAL.0.0", "removed completely");
intersectingNodes.forEach(node -> node.getEntities().remove(this));
pages.forEach(page -> page.getEntities().remove(this));
intersectingNodes = new LinkedList<>();
relations.keySet()
.forEach(entity -> entity.getRelations().remove(this));
relations = new HashedMap<>();
deepestFullyContainingNode = null;
pages = new HashSet<>();
remove("FINAL.0.0", "removed completely");
}
@@ -212,34 +194,28 @@ public class TextEntity implements IEntity {
public boolean containedBy(TextEntity textEntity) {
return textEntity.contains(this);
return this.textRange.containedBy(textEntity.getTextRange()) //
|| duplicateTextRanges.stream()
.anyMatch(duplicateTextRange -> duplicateTextRange.containedBy(textEntity.textRange)) //
|| duplicateTextRanges.stream()
.anyMatch(duplicateTextRange -> textEntity.getDuplicateTextRanges()
.stream()
.anyMatch(duplicateTextRange::containedBy));
}
public boolean contains(TextEntity textEntity) {
if (this.textRange.contains(textEntity.getTextRange())) {
return true;
}
Set<TextRange> textEntityDuplicateRanges = textEntity.getDuplicateTextRanges();
for (TextRange duplicateTextRange : this.duplicateTextRanges) {
if (duplicateTextRange.contains(textEntity.getTextRange())) {
return true;
}
for (TextRange otherRange : textEntityDuplicateRanges) {
if (duplicateTextRange.contains(otherRange)) {
return true;
}
}
}
return false;
return this.textRange.contains(textEntity.getTextRange()) //
|| duplicateTextRanges.stream()
.anyMatch(duplicateTextRange -> duplicateTextRange.contains(textEntity.textRange)) //
|| duplicateTextRanges.stream()
.anyMatch(duplicateTextRange -> textEntity.getDuplicateTextRanges()
.stream()
.anyMatch(duplicateTextRange::contains));
}
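The rewritten `contains` above replaces the nested loops with three stream checks: the entity's own range, its duplicate ranges against the other's range, and its duplicate ranges against the other's duplicates. A self-contained sketch of the same check, with a hypothetical `Range` record standing in for `TextRange` (half-open intervals assumed):

```java
import java.util.List;

public class ContainmentSketch {

    // Minimal stand-in for TextRange: start inclusive, end exclusive (an assumption).
    public record Range(int start, int end) {
        boolean contains(Range other) {
            return start <= other.start && other.end <= end;
        }
    }

    // Mirrors the refactored TextEntity.contains: direct range first,
    // then duplicates vs. the other's range, then duplicates vs. duplicates.
    public static boolean contains(Range self, List<Range> selfDups,
                                   Range other, List<Range> otherDups) {
        return self.contains(other)
                || selfDups.stream().anyMatch(d -> d.contains(other))
                || selfDups.stream().anyMatch(d -> otherDups.stream().anyMatch(d::contains));
    }

    public static void main(String[] args) {
        Range a = new Range(0, 10);
        Range b = new Range(2, 5);
        System.out.println(contains(a, List.of(), b, List.of()));                 // true
        System.out.println(contains(b, List.of(new Range(0, 20)), a, List.of())); // true, via a duplicate range
    }
}
```

Because `||` and `anyMatch` both short-circuit, the stream form keeps the early-exit behavior of the original loop-and-return version while stating the three cases declaratively.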
public boolean intersects(TextEntity textEntity) {
return this.textRange.intersects(textEntity.getTextRange()) //
@@ -264,20 +240,6 @@ public class TextEntity implements IEntity {
}
public void addManualChange(BaseAnnotation manualChange) {
manualOverwrite.addChange(manualChange);
notifyEntityUpdated();
}
public void addManualChanges(List<BaseAnnotation> manualChanges) {
manualOverwrite.addChanges(manualChanges);
notifyEntityUpdated();
}
public boolean matchesAnnotationId(String manualRedactionId) {
return getPositionsOnPagePerPage().stream()
@@ -316,21 +278,6 @@ public class TextEntity implements IEntity {
}
/**
* @return true when this entity is of EntityType ENTITY or HINT
*/
public boolean validEntityType() {
return entityType.equals(EntityType.ENTITY) || entityType.equals(EntityType.HINT);
}
public boolean valid() {
return active() && validEntityType();
}
@Override
public String value() {
@@ -338,42 +285,4 @@ public class TextEntity implements IEntity {
.orElse(getMatchedRule().isWriteValueWithLineBreaks() ? getValueWithLineBreaks() : value);
}
@Override
public void addEntityEventListener(EntityEventListener listener) {
entityEventListeners.add(listener);
}
@Override
public void removeEntityEventListener(EntityEventListener listener) {
entityEventListeners.remove(listener);
}
public void computeRelations() {
for (TextEntity textEntity : this.getDeepestFullyContainingNode().getEntities()) {
if (this.intersects(textEntity) && !this.equals(textEntity) && !textEntity.getEntityType().equals(EntityType.TEMPORARY)) {
if (textEntity.getTextRange().equals(this.getTextRange())) {
textEntity.getRelations().computeIfAbsent(this, k -> new HashSet<>()).add(new Equality(this, textEntity));
this.getRelations().computeIfAbsent(textEntity, k -> new HashSet<>()).add(new Equality(textEntity, this));
} else if (textEntity.containedBy(this)) {
textEntity.getRelations().computeIfAbsent(this, k -> new HashSet<>()).add(new Intersection(textEntity, this));
this.getRelations().computeIfAbsent(textEntity, k -> new HashSet<>()).add(new Containment(this, textEntity));
} else if (this.containedBy(textEntity)) {
textEntity.getRelations().computeIfAbsent(this, k -> new HashSet<>()).add(new Containment(textEntity, this));
this.getRelations().computeIfAbsent(textEntity, k -> new HashSet<>()).add(new Intersection(this, textEntity));
} else {
textEntity.getRelations().computeIfAbsent(this, k -> new HashSet<>()).add(new Intersection(textEntity, this));
this.getRelations().computeIfAbsent(textEntity, k -> new HashSet<>()).add(new Intersection(this, textEntity));
}
}
}
}
}


@@ -9,6 +9,7 @@ import java.util.Set;
import com.iqser.red.service.redaction.v1.server.model.document.DocumentTree;
import com.iqser.red.service.redaction.v1.server.model.document.entity.TextEntity;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlock;
import com.knecon.fforesight.service.layoutparser.internal.api.data.redaction.LayoutEngine;
import lombok.AccessLevel;
import lombok.AllArgsConstructor;


@@ -10,7 +10,6 @@ import java.util.stream.Collectors;
import java.util.stream.Stream;
import com.iqser.red.service.redaction.v1.server.model.document.DocumentTree;
import com.iqser.red.service.redaction.v1.server.model.document.NodeVisitor;
import com.iqser.red.service.redaction.v1.server.model.document.textblock.TextBlock;
import lombok.AccessLevel;
@@ -39,6 +38,7 @@ public class Document extends AbstractSemanticNode {
@Builder.Default
static final SectionIdentifier sectionIdentifier = SectionIdentifier.document();
@Override
public NodeType getType() {
@@ -63,8 +63,8 @@ *
*
* @return A list of main sections within the document
* @deprecated This method is marked for removal.
* Use {@link #streamChildrenOfType(NodeType)} instead,
* or {@link #getChildrenOfTypeSectionOrSuperSection()} which returns children of type SECTION as well as SUPER_SECTION.
* Use {@link #streamChildrenOfType(NodeType)} instead,
* or {@link #getChildrenOfTypeSectionOrSuperSection()} which returns children of type SECTION as well as SUPER_SECTION.
*/
@Deprecated(forRemoval = true)
public List<Section> getMainSections() {
@@ -168,11 +168,4 @@ public class Document extends AbstractSemanticNode {
return bBox;
}
@Override
public void accept(NodeVisitor visitor) {
visitor.visit(this);
}
}

Some files were not shown because too many files have changed in this diff.