Merge in RR/pyinfra from RED-4653 to master
Squashed commit of the following:
commit 14ed6d2ee79f9a6bc4bad187dc775f7476a05d97
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Tue Jul 26 11:08:16 2022 +0200
RED-4653: Disabled coverage check since there are no tests at the moment
commit e926631b167d03e8cc0867db5b5c7d44d6612dcf
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Tue Jul 26 10:58:50 2022 +0200
RED-4653: Re-added test execution scripts
commit 94648cc449bbc392864197a1796f99f8953b7312
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Tue Jul 26 10:50:42 2022 +0200
RED-4653: Changed error case for processing messages to not requeue the message since that will be handled in DLQ logic
commit d77982dfedcec49482293d79818283c8d7a17dc7
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Tue Jul 26 10:46:32 2022 +0200
RED-4653: Removed unnecessary logging message
commit 8c00fd75bf04f8ecc0e9cda654f8e053d4cfb66f
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Tue Jul 26 10:03:35 2022 +0200
RED-4653: Re-added wrongly removed config
commit 759d72b3fa093b19f97e68d17bf53390cd5453c7
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Tue Jul 26 09:57:47 2022 +0200
RED-4653: Removed leftover Docker commands
commit 2ff5897ee38e39d6507278b6a82176be2450da16
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Tue Jul 26 09:48:08 2022 +0200
RED-4653: Removed leftover Docker config
commit 1074167aa98f9f59c0f0f534ba2f1ba09ffb0958
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Tue Jul 26 09:41:21 2022 +0200
RED-4653: Removed Docker build stage since it is not needed for a project that is used as a Python module
commit ec769c6cd74a74097d8ebe4800ea6e2ea86236cc
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Mon Jul 25 16:11:50 2022 +0200
RED-4653: Renamed function for better clarity and consistency
commit 96e8ac4316ac57aac90066f35422d333c532513b
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Mon Jul 25 15:07:40 2022 +0200
RED-4653: Added code to cancel the queue subscription on application exit to queue manager so that it can exit gracefully
commit 64d8e0bd15730898274c08d34f9c34fbac559422
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Mon Jul 25 13:57:06 2022 +0200
RED-4653: Moved queue cancellation to a separate method so that it can be called on application exit
commit aff1d06364f5694c5922f37d961e401c12243221
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Mon Jul 25 11:51:16 2022 +0200
RED-4653: Re-ordered message processing so that ack occurs after publishing the result, to prevent message loss
commit 9339186b86f2fe9653366c22fcdc9f7fc096b138
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Fri Jul 22 18:07:25 2022 +0200
RED-4653: Reordered code to acknowledge message before publishing a result message
commit 2d6fe1cbd95cd86832b086c6dfbcfa62b3ffa16f
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Fri Jul 22 17:00:04 2022 +0200
RED-4653: Hopefully corrected storage bucket env var name
commit 8f1ef0dd5532882cb12901721195d9acb336286c
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Fri Jul 22 16:37:27 2022 +0200
RED-4653: Switched to validating the connection url via a regex since the validators lib parses our endpoints incorrectly
commit 8d0234fcc5ff7ed1ae7695a17856c6af050065bd
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Fri Jul 22 15:02:54 2022 +0200
RED-4653: Corrected exception creation
commit 098a62335b3b695ee409363d429ac07284de7138
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Fri Jul 22 14:42:22 2022 +0200
RED-4653: Added a descriptive error message when the storage endpoint is not a correct URL
commit 379685f964a4de641ce6506713f1ea8914a3f5ab
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Fri Jul 22 14:11:48 2022 +0200
RED-4653: Removed variable re-use to make the code clearer
commit 4bf1a023453635568e16b1678ef5ad994c534045
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Thu Jul 21 17:41:55 2022 +0200
RED-4653: Added explicit conversion of the heartbeat config value to an int before passing it to pika
commit 8f2bc4e028aafdef893458d1433a05724f534fce
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Mon Jul 18 16:41:31 2022 +0200
RED-4653: Set heartbeat to lower value so that disconnects are detected more quickly
... and 6 more commits
Infrastructure to deploy Research Projects
The Infrastructure expects to be deployed in the same Pod / local environment as the analysis container and handles all outbound communication.
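The commit notes above describe acknowledging a request only after its result has been published, and rejecting failed requests without requeueing so that dead-letter handling can take over. The following is a minimal sketch of that ordering, not the project's actual code. `handle_message`, `analyse`, and `RecordingChannel` are hypothetical names; the channel methods mirror the `basic_publish`, `basic_ack`, and `basic_nack` methods of a pika channel.

```python
import json
from typing import Any, Callable


class RecordingChannel:
    """Hypothetical stand-in for a pika channel that only records calls."""

    def __init__(self) -> None:
        self.calls = []

    def basic_publish(self, exchange, routing_key, body):
        self.calls.append(("basic_publish", {"exchange": exchange,
                                             "routing_key": routing_key,
                                             "body": body}))

    def basic_ack(self, delivery_tag):
        self.calls.append(("basic_ack", {"delivery_tag": delivery_tag}))

    def basic_nack(self, delivery_tag, requeue=True):
        self.calls.append(("basic_nack", {"delivery_tag": delivery_tag,
                                          "requeue": requeue}))


def handle_message(channel: Any, delivery_tag: int, body: bytes,
                   analyse: Callable[[dict], dict],
                   response_queue: str = "response_queue") -> bool:
    """Process one request; return True if a result was published."""
    try:
        request = json.loads(body)
        result = analyse(request)  # assumed: forwards to the analysis container
        channel.basic_publish(exchange="", routing_key=response_queue,
                              body=json.dumps(result).encode("utf-8"))
    except Exception:
        # Reject without requeueing; the broker's dead-letter setup is
        # assumed to route the message onwards.
        channel.basic_nack(delivery_tag=delivery_tag, requeue=False)
        return False
    # Acknowledge only after the result is out, so a crash before this
    # point leaves the request unacknowledged rather than lost.
    channel.basic_ack(delivery_tag=delivery_tag)
    return True
```

With a real connection, the same function could be called from a pika consumer callback, passing the callback's channel and `method.delivery_tag`.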
Configuration
The default configuration is located in /config.yaml. Each variable listed below can be overridden by exporting the corresponding environment variable.
| Environment Variable | Default | Description |
|---|---|---|
| LOGGING_LEVEL_ROOT | DEBUG | Logging level for service logger |
| PROBING_WEBSERVER_HOST | "0.0.0.0" | Probe webserver address |
| PROBING_WEBSERVER_PORT | 8080 | Probe webserver port |
| PROBING_WEBSERVER_MODE | production | Webserver mode: {development, production} |
| RABBITMQ_HOST | localhost | RabbitMQ host address |
| RABBITMQ_PORT | 5672 | RabbitMQ host port |
| RABBITMQ_USERNAME | user | RabbitMQ username |
| RABBITMQ_PASSWORD | bitnami | RabbitMQ password |
| RABBITMQ_HEARTBEAT | 7200 | Controls AMQP heartbeat timeout in seconds |
| REQUEST_QUEUE | request_queue | Requests to service |
| RESPONSE_QUEUE | response_queue | Responses by service |
| DEAD_LETTER_QUEUE | dead_letter_queue | Messages that failed to process |
| ANALYSIS_ENDPOINT | "http://127.0.0.1:5000" | Endpoint for analysis container |
| STORAGE_BACKEND | s3 | The type of storage to use {s3, azure} |
| STORAGE_BUCKET | "pyinfra-test-bucket" | The bucket / container to pull files specified in queue requests from |
| STORAGE_ENDPOINT | "http://127.0.0.1:9000" | Endpoint for s3 storage |
| STORAGE_KEY | root | User for s3 storage |
| STORAGE_SECRET | password | Password for s3 storage |
| STORAGE_AZURECONNECTIONSTRING | "DefaultEndpointsProtocol=..." | Connection string for Azure storage |
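As an illustration of how these variables might be consumed, the sketch below reads a subset of them from the environment, falls back to the defaults in the table, and converts numeric values to `int` (the commit notes mention that the heartbeat value is converted before it is passed to pika). `DEFAULTS` and `load_settings` are hypothetical names, not part of pyinfra, and the real project may load its configuration differently.

```python
import os
from typing import Mapping, Optional

# Defaults copied from the table above (subset, for brevity).
DEFAULTS = {
    "RABBITMQ_HOST": "localhost",
    "RABBITMQ_PORT": "5672",
    "RABBITMQ_HEARTBEAT": "7200",
    "REQUEST_QUEUE": "request_queue",
    "RESPONSE_QUEUE": "response_queue",
    "DEAD_LETTER_QUEUE": "dead_letter_queue",
    "STORAGE_BACKEND": "s3",
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return settings from `env` (default: os.environ) with table defaults."""
    env = os.environ if env is None else env
    settings = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    # Environment variables are always strings, so convert numeric values.
    settings["RABBITMQ_PORT"] = int(settings["RABBITMQ_PORT"])
    settings["RABBITMQ_HEARTBEAT"] = int(settings["RABBITMQ_HEARTBEAT"])
    if settings["STORAGE_BACKEND"] not in {"s3", "azure"}:
        raise ValueError(
            f"STORAGE_BACKEND must be 's3' or 'azure', got {settings['STORAGE_BACKEND']!r}"
        )
    return settings
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise in isolation, for example `load_settings({"RABBITMQ_HEARTBEAT": "60"})`.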
Response Format
Expected AMQP input message:
{
    "dossierId": "",
    "fileId": "",
    "targetFileExtension": "",
    "responseFileExtension": ""
}
Optionally, the input message can contain a field with the key "operations".
AMQP output message:
{
    "dossierId": "",
    "fileId": "",
    ...
}
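The message shapes above can be handled with a small helper pair such as the one sketched here. It assumes that all four input keys are mandatory, which the README does not state explicitly, and it copies only `dossierId` and `fileId` into the response; any further output fields are left to the caller because the README elides them. `parse_request` and `build_response` are hypothetical names.

```python
import json

# Assumption: every key shown in the expected input message is required.
REQUIRED_FIELDS = ("dossierId", "fileId",
                   "targetFileExtension", "responseFileExtension")


def parse_request(body: bytes) -> dict:
    """Decode an AMQP request body and check that the expected keys exist."""
    message = json.loads(body)
    missing = [field for field in REQUIRED_FIELDS if field not in message]
    if missing:
        raise ValueError(f"request is missing fields: {missing}")
    return message


def build_response(request: dict, **extra) -> bytes:
    """Encode a response that echoes the request's identifiers.

    Additional, service-specific fields can be supplied via `extra`.
    """
    response = {"dossierId": request["dossierId"],
                "fileId": request["fileId"],
                **extra}
    return json.dumps(response).encode("utf-8")
```

An optional "operations" key in the request passes through `parse_request` unchanged, since only the presence of the required keys is checked.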
Development
Either run src/serve.py or the built Docker image.
Setup
Install the module:
pip install -e .
pip install -r requirements.txt
Or build the Docker image:
docker build -f Dockerfile -t pyinfra .
Usage
Shell 1: Start the MinIO and RabbitMQ Docker containers.
docker-compose up
Shell 2: Add files to the local MinIO storage.
python scripts/manage_minio.py add <MinIO target folder> -d path/to/a/folder/with/PDFs
Shell 2: Run pyinfra-server.
python src/serve.py
Or run it as a container:
docker run --net=host pyinfra
Shell 3: Run analysis-container.
Shell 4: Start a client that sends requests to process PDFs from the MinIO store and annotates these PDFs according to the service responses.
python scripts/mock_client.py
Release 4.1.0