Merge in RR/pyinfra from RED-4564 to master
Squashed commit of the following:
commit bf4c85a0ab9fed19a44508f2cbef6858cbb32259
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Fri Jul 8 15:46:11 2022 +0200
RED-4564: POC-test to see if cancelling the consumer prevents messages from being stuck
commit 12ebd186b220f263ac2275463b0c124e8f4210fc
Author: Viktor Seifert <viktor.seifert@iqser.com>
Date: Fri Jul 8 14:32:05 2022 +0200
RED-4564: Print full exception with traceback when processing from the queue
Pull request #37: RED-4124 PyInfra now tries three times to download an unobtainable object before republishing message, removed unreadable stack trace print if message is put on dead letter queue since allready logged where the exeption is raised.
Infrastructure to deploy Research Projects
The Infrastructure expects to be deployed in the same Pod / local environment as the analysis container and handles all outbound communication.
Configuration
A configuration is located in /config.yaml. All relevant variables can be configured via exporting environment variables.
| Environment Variable | Default | Description |
|---|---|---|
| LOGGING_LEVEL_ROOT | DEBUG | Logging level for service logger |
| PROBING_WEBSERVER_HOST | "0.0.0.0" | Probe webserver address |
| PROBING_WEBSERVER_PORT | 8080 | Probe webserver port |
| PROBING_WEBSERVER_MODE | production | Webserver mode: {development, production} |
| RABBITMQ_HOST | localhost | RabbitMQ host address |
| RABBITMQ_PORT | 5672 | RabbitMQ host port |
| RABBITMQ_USERNAME | user | RabbitMQ username |
| RABBITMQ_PASSWORD | bitnami | RabbitMQ password |
| RABBITMQ_HEARTBEAT | 7200 | Controls AMQP heartbeat timeout in seconds |
| REQUEST_QUEUE | request_queue | Requests to service |
| RESPONSE_QUEUE | response_queue | Responses by service |
| DEAD_LETTER_QUEUE | dead_letter_queue | Messages that failed to process |
| ANALYSIS_ENDPOINT | "http://127.0.0.1:5000" | Endpoint for analysis container |
| STORAGE_BACKEND | s3 | The type of storage to use {s3, azure} |
| STORAGE_BUCKET | "pyinfra-test-bucket" | The bucket / container to pull files specified in queue requests from |
| STORAGE_ENDPOINT | "http://127.0.0.1:9000" | Endpoint for s3 storage |
| STORAGE_KEY | root | User for s3 storage |
| STORAGE_SECRET | password | Password for s3 storage |
| STORAGE_AZURECONNECTIONSTRING | "DefaultEndpointsProtocol=..." | Connection string for Azure storage |
Response Format
Expected AMQP input message:
{
"dossierId": "",
"fileId": "",
"targetFileExtension": "",
"responseFileExtension": ""
}
Optionally, the input message can contain a field with the key "operations".
AMQP output message:
{
"dossierId": "",
"fileId": "",
...
}
Development
Either run src/serve.py or the built Docker image.
Setup
Install module.
pip install -e .
pip install -r requirements.txt
or build docker image.
docker build -f Dockerfile -t pyinfra .
Usage
Shell 1: Start a MinIO and a RabbitMQ docker container.
docker-compose up
Shell 2: Add files to the local minio storage.
python scripts/manage_minio.py add <MinIO target folder> -d path/to/a/folder/with/PDFs
Shell 2: Run pyinfra-server.
python src/serve.py
or as container:
docker run --net=host pyinfra
Shell 3: Run analysis-container.
Shell 4: Start a client that sends requests to process PDFs from the MinIO store and annotates these PDFs according to the service responses.
python scripts/mock_client.py
Description
Release 4.1.0
Latest
Languages
Python
96.7%
Makefile
2%
Shell
1.3%