Viktor Seifert e3abf2be0f Pull request #44: RED-4653
Merge in RR/pyinfra from RED-4653 to master

Squashed commit of the following:

commit 14ed6d2ee79f9a6bc4bad187dc775f7476a05d97
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Tue Jul 26 11:08:16 2022 +0200

    RED-4653: Disabled coverage check since there are no tests at the moment

commit e926631b167d03e8cc0867db5b5c7d44d6612dcf
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Tue Jul 26 10:58:50 2022 +0200

    RED-4653: Re-added test execution scripts

commit 94648cc449bbc392864197a1796f99f8953b7312
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Tue Jul 26 10:50:42 2022 +0200

    RED-4653: Changed error case for processing messages to not requeue the message since that will be handled in DLQ logic

commit d77982dfedcec49482293d79818283c8d7a17dc7
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Tue Jul 26 10:46:32 2022 +0200

    RED-4653: Removed unnecessary logging message

commit 8c00fd75bf04f8ecc0e9cda654f8e053d4cfb66f
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Tue Jul 26 10:03:35 2022 +0200

    RED-4653: Re-added wrongly removed config

commit 759d72b3fa093b19f97e68d17bf53390cd5453c7
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Tue Jul 26 09:57:47 2022 +0200

    RED-4653: Removed leftover Docker commands

commit 2ff5897ee38e39d6507278b6a82176be2450da16
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Tue Jul 26 09:48:08 2022 +0200

    RED-4653: Removed leftover Docker config

commit 1074167aa98f9f59c0f0f534ba2f1ba09ffb0958
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Tue Jul 26 09:41:21 2022 +0200

    RED-4653: Removed Docker build stage since it is not needed for a project that is used as a Python module

commit ec769c6cd74a74097d8ebe4800ea6e2ea86236cc
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Mon Jul 25 16:11:50 2022 +0200

    RED-4653: Renamed function for better clarity and consistency

commit 96e8ac4316ac57aac90066f35422d333c532513b
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Mon Jul 25 15:07:40 2022 +0200

    RED-4653: Added code to cancel the queue subscription on application exit to queue manager so that it can exit gracefully

commit 64d8e0bd15730898274c08d34f9c34fbac559422
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Mon Jul 25 13:57:06 2022 +0200

    RED-4653: Moved queue cancellation to a separate method so that it can be called on application exit

commit aff1d06364f5694c5922f37d961e401c12243221
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Mon Jul 25 11:51:16 2022 +0200

    RED-4653: Re-ordered message processing so that ack occurs after publishing the result, to prevent message loss

commit 9339186b86f2fe9653366c22fcdc9f7fc096b138
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Fri Jul 22 18:07:25 2022 +0200

    RED-4653: Reordered code to acknowledge message before publishing a result message

commit 2d6fe1cbd95cd86832b086c6dfbcfa62b3ffa16f
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Fri Jul 22 17:00:04 2022 +0200

    RED-4653: Hopefully corrected storage bucket env var name

commit 8f1ef0dd5532882cb12901721195d9acb336286c
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Fri Jul 22 16:37:27 2022 +0200

    RED-4653: Switched to validating the connection url via a regex since the validators lib parses our endpoints incorrectly

commit 8d0234fcc5ff7ed1ae7695a17856c6af050065bd
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Fri Jul 22 15:02:54 2022 +0200

    RED-4653: Corrected exception creation

commit 098a62335b3b695ee409363d429ac07284de7138
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Fri Jul 22 14:42:22 2022 +0200

    RED-4653: Added a descriptive error message when the storage endpoint is not a correct URL

commit 379685f964a4de641ce6506713f1ea8914a3f5ab
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Fri Jul 22 14:11:48 2022 +0200

    RED-4653: Removed variable re-use to make the code clearer

commit 4bf1a023453635568e16b1678ef5ad994c534045
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Thu Jul 21 17:41:55 2022 +0200

    RED-4653: Added explicit conversion of the heartbeat config value to an int before passing it to pika

commit 8f2bc4e028aafdef893458d1433a05724f534fce
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date:   Mon Jul 18 16:41:31 2022 +0200

    RED-4653: Set heartbeat to lower value so that disconnects are detected more quickly

... and 6 more commits

Infrastructure to deploy Research Projects

The Infrastructure expects to be deployed in the same Pod / local environment as the analysis container and handles all outbound communication.

Configuration

The default configuration is located in /config.yaml. All relevant variables can be overridden by exporting environment variables.

| Environment Variable | Default | Description |
|---|---|---|
| LOGGING_LEVEL_ROOT | DEBUG | Logging level for service logger |
| PROBING_WEBSERVER_HOST | "0.0.0.0" | Probe webserver address |
| PROBING_WEBSERVER_PORT | 8080 | Probe webserver port |
| PROBING_WEBSERVER_MODE | production | Webserver mode: {development, production} |
| RABBITMQ_HOST | localhost | RabbitMQ host address |
| RABBITMQ_PORT | 5672 | RabbitMQ host port |
| RABBITMQ_USERNAME | user | RabbitMQ username |
| RABBITMQ_PASSWORD | bitnami | RabbitMQ password |
| RABBITMQ_HEARTBEAT | 7200 | Controls AMQP heartbeat timeout in seconds |
| REQUEST_QUEUE | request_queue | Requests to service |
| RESPONSE_QUEUE | response_queue | Responses by service |
| DEAD_LETTER_QUEUE | dead_letter_queue | Messages that failed to process |
| ANALYSIS_ENDPOINT | "http://127.0.0.1:5000" | Endpoint for analysis container |
| STORAGE_BACKEND | s3 | The type of storage to use {s3, azure} |
| STORAGE_BUCKET | "pyinfra-test-bucket" | The bucket / container to pull files specified in queue requests from |
| STORAGE_ENDPOINT | "http://127.0.0.1:9000" | Endpoint for s3 storage |
| STORAGE_KEY | root | User for s3 storage |
| STORAGE_SECRET | password | Password for s3 storage |
| STORAGE_AZURECONNECTIONSTRING | "DefaultEndpointsProtocol=..." | Connection string for Azure storage |
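
The override behaviour described above can be illustrated with a minimal sketch. It is not pyinfra's actual config loader: `DEFAULTS`, `INT_KEYS` and `load_settings` are hypothetical names, and only a subset of the variables is shown. The defaults are copied from the table, and numeric values are converted to `int` explicitly, since environment variables always arrive as strings.

```python
import os

# Defaults copied from the table above (subset only).
# The loader itself is a hypothetical illustration, not pyinfra code.
DEFAULTS = {
    "RABBITMQ_HOST": "localhost",
    "RABBITMQ_PORT": "5672",
    "RABBITMQ_HEARTBEAT": "7200",
    "REQUEST_QUEUE": "request_queue",
    "RESPONSE_QUEUE": "response_queue",
    "DEAD_LETTER_QUEUE": "dead_letter_queue",
    "STORAGE_BACKEND": "s3",
}

# Values that must be integers before they are handed to a client library.
INT_KEYS = {"RABBITMQ_PORT", "RABBITMQ_HEARTBEAT"}


def load_settings(environ=None):
    """Return settings where environment variables override the defaults."""
    environ = os.environ if environ is None else environ
    settings = {}
    for key, default in DEFAULTS.items():
        raw = environ.get(key, default)
        settings[key] = int(raw) if key in INT_KEYS else raw
    # The table lists exactly two supported storage backends.
    if settings["STORAGE_BACKEND"] not in ("s3", "azure"):
        raise ValueError(
            f"STORAGE_BACKEND must be 's3' or 'azure', got {settings['STORAGE_BACKEND']!r}"
        )
    return settings
```

Calling `load_settings()` without arguments reads from the process environment; passing a plain dict makes the function easy to exercise in isolation.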

Response Format

Expected AMQP input message:

{
   "dossierId": "",
   "fileId": "",
   "targetFileExtension": "",
   "responseFileExtension": ""
}

Optionally, the input message can contain a field with the key "operations".

AMQP output message:

{
  "dossierId": "",
  "fileId": "",
  ...
}
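
As a rough sketch of how a consumer might check an incoming message against the format above and start building a reply: `parse_request` and `build_response` are hypothetical helpers, not functions from pyinfra. The fields elided by `...` in the output message are not specified here, so the sketch only echoes `dossierId` and `fileId` and leaves any further fields to the caller.

```python
import json

# Keys listed in the expected AMQP input message above.
REQUIRED_KEYS = ("dossierId", "fileId", "targetFileExtension", "responseFileExtension")


def parse_request(body):
    """Decode a request body and check that all required keys are present."""
    message = json.loads(body)
    if not isinstance(message, dict):
        raise ValueError("request body must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in message]
    if missing:
        raise ValueError(f"request is missing keys: {missing}")
    return message


def build_response(request, **extra_fields):
    """Echo dossierId and fileId; extra_fields stands in for the unspecified rest."""
    response = {"dossierId": request["dossierId"], "fileId": request["fileId"]}
    response.update(extra_fields)
    return json.dumps(response)
```

The optional "operations" key passes through `parse_request` untouched, since only the required keys are checked.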

Development

Either run src/serve.py or the built Docker image.

Setup

Install the module:

pip install -e .
pip install -r requirements.txt

Or build the Docker image:

docker build -f Dockerfile -t pyinfra .

Usage

Shell 1: Start a MinIO and a RabbitMQ docker container.

docker-compose up

Shell 2: Add files to the local MinIO storage.

python scripts/manage_minio.py add <MinIO target folder> -d path/to/a/folder/with/PDFs

Shell 2: Run pyinfra-server.

python src/serve.py

or as container:

docker run --net=host pyinfra

Shell 3: Run analysis-container.

Shell 4: Start a client that sends requests to process PDFs from the MinIO store and annotates these PDFs according to the service responses.

python scripts/mock_client.py
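
For orientation, sending a single request by hand could look roughly like the sketch below. It is not the contents of scripts/mock_client.py: `build_request` and `publish_request` are hypothetical helpers, the connection defaults are taken from the configuration table, and the sketch assumes the request queue already exists on the broker and that the third-party `pika` package is installed.

```python
import json


def build_request(dossier_id, file_id, target_ext, response_ext):
    """Serialize a request in the input message format described above."""
    return json.dumps(
        {
            "dossierId": dossier_id,
            "fileId": file_id,
            "targetFileExtension": target_ext,
            "responseFileExtension": response_ext,
        }
    )


def publish_request(body, host="localhost", port=5672,
                    username="user", password="bitnami",
                    queue="request_queue"):
    """Publish one message to the request queue via the default exchange."""
    # Imported here so that build_request stays usable without pika installed.
    import pika

    credentials = pika.PlainCredentials(username, password)
    parameters = pika.ConnectionParameters(host=host, port=port, credentials=credentials)
    connection = pika.BlockingConnection(parameters)
    try:
        channel = connection.channel()
        # With the default exchange, the routing key is the queue name.
        channel.basic_publish(exchange="", routing_key=queue, body=body)
    finally:
        connection.close()
```

With the containers from Shell 1 running, `publish_request(build_request("<dossier id>", "<file id>", "<target extension>", "<response extension>"))` would enqueue one request; the placeholder values depend on what was uploaded to MinIO.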