Merge in RR/pyinfra from prefetch_adjustment to master
Squashed commit of the following:
commit 6f9d75bf49ad196bf5728386527499025ac27b3a
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date: Thu Apr 21 13:47:40 2022 +0200
removed tests without much value that caused teardown problems with docker containers
commit b7ccbe20e3babbf1127ea5738a1d710d8029c90b
Merge: 51a459c 5925737
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date: Thu Apr 21 13:04:34 2022 +0200
Merge branch 'sonarscan_refac' into prefetch_adjustment
commit 51a459cbf04e9884cf6b7c2c3145206ecf1a0ffb
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date: Thu Apr 21 13:03:48 2022 +0200
set prefetch count to 1; removed obsolete imports
commit 592573793cdfd098012a98cfc7ab0cc1fbfd0e44
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date: Tue Apr 19 18:01:46 2022 +0200
refactoring; cleanup
# Infrastructure to deploy Research Projects
The infrastructure is expected to be deployed in the same Pod / local environment as the analysis container and handles all outbound communication.
## Configuration
A configuration file is located in `/config.yaml`. All relevant variables can be overridden by exporting the corresponding environment variables; a sketch of how such overrides might be applied follows the table below.
| Environment Variable | Default | Description |
|---|---|---|
| LOGGING_LEVEL_ROOT | DEBUG | Logging level of the service logger |
| PROBING_WEBSERVER_HOST | "0.0.0.0" | Probe webserver address |
| PROBING_WEBSERVER_PORT | 8080 | Probe webserver port |
| PROBING_WEBSERVER_MODE | production | Webserver mode: {development, production} |
| RABBITMQ_HOST | localhost | RabbitMQ host address |
| RABBITMQ_PORT | 5672 | RabbitMQ host port |
| RABBITMQ_USERNAME | user | RabbitMQ username |
| RABBITMQ_PASSWORD | bitnami | RabbitMQ password |
| RABBITMQ_HEARTBEAT | 7200 | AMQP heartbeat timeout in seconds |
| REQUEST_QUEUE | request_queue | Queue for requests to the service |
| RESPONSE_QUEUE | response_queue | Queue for responses from the service |
| DEAD_LETTER_QUEUE | dead_letter_queue | Queue for messages that failed to process |
| ANALYSIS_ENDPOINT | "http://127.0.0.1:5000" | Endpoint of the analysis container |
| STORAGE_BACKEND | s3 | Type of storage to use: {s3, azure} |
| STORAGE_BUCKET | "pyinfra-test-bucket" | Bucket / container from which files referenced in queue requests are pulled |
| STORAGE_ENDPOINT | "http://127.0.0.1:9000" | Endpoint of the S3 storage |
| STORAGE_KEY | root | Access key (user) for the S3 storage |
| STORAGE_SECRET | password | Secret key (password) for the S3 storage |
| STORAGE_AZURECONNECTIONSTRING | "DefaultEndpointsProtocol=..." | Connection string for Azure storage |
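How `/config.yaml` and the environment are actually combined is not shown in this README, so the following is only a minimal sketch under the assumption that the file uses the variable names from the table as keys (the file layout and the `load_config` helper are illustrative, not repository code):

```python
import os

import yaml  # PyYAML


def load_config(path="/config.yaml"):
    """Illustrative sketch: read defaults from the YAML file, then let
    environment variables such as RABBITMQ_HOST override them."""
    with open(path) as fh:
        config = yaml.safe_load(fh) or {}
    for key, default in config.items():
        # An exported environment variable takes precedence over the file default.
        config[key] = os.environ.get(key, default)
    return config


if __name__ == "__main__":
    cfg = load_config()
    print(cfg.get("RABBITMQ_HOST", "localhost"))
```

With a scheme like this, exporting `RABBITMQ_HOST` before starting the service is enough to point it at a different broker.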
## Message Format
Expected AMQP input message:

```json
{
    "dossierId": "",
    "fileId": "",
    "targetFileExtension": "",
    "responseFileExtension": ""
}
```
Optionally, the input message can contain a field with the key "operations".
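The repository ships its own client (`scripts/mock_client.py`, see Usage below); purely as an illustration, a request of this shape could be published to the request queue with `pika`, using only the defaults from the configuration table. The field values below are placeholders, and the snippet assumes the service has already declared `request_queue`:

```python
import json

import pika  # AMQP client, used here only for illustration

# Connection defaults from the configuration table above.
credentials = pika.PlainCredentials("user", "bitnami")
connection = pika.BlockingConnection(
    pika.ConnectionParameters(host="localhost", port=5672, credentials=credentials)
)
channel = connection.channel()

request = {
    "dossierId": "example-dossier",    # placeholder value
    "fileId": "example-file.pdf",      # placeholder value
    "targetFileExtension": "pdf",      # placeholder value
    "responseFileExtension": "json",   # placeholder value
}
channel.basic_publish(
    exchange="",
    routing_key="request_queue",  # REQUEST_QUEUE default
    body=json.dumps(request),
)
connection.close()
```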
AMQP output message:

```json
{
    "dossierId": "",
    "fileId": "",
    ...
}
```
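Correspondingly, a minimal sketch (again not repository code) of reading responses from the response queue with `pika`, using the same documented defaults:

```python
import json

import pika

credentials = pika.PlainCredentials("user", "bitnami")
connection = pika.BlockingConnection(
    pika.ConnectionParameters(host="localhost", port=5672, credentials=credentials)
)
channel = connection.channel()


def on_response(ch, method, properties, body):
    # Responses carry at least the dossierId and fileId of the original request.
    response = json.loads(body)
    print(response["dossierId"], response["fileId"])
    ch.basic_ack(delivery_tag=method.delivery_tag)


channel.basic_consume(queue="response_queue", on_message_callback=on_response)
channel.start_consuming()
```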
## Development

Either run `src/serve.py` directly or use the built Docker image.

### Setup

Install the module:

```bash
pip install -e .
pip install -r requirements.txt
```

or build the Docker image:

```bash
docker build -f Dockerfile -t pyinfra .
```
### Usage

Shell 1: Start a MinIO and a RabbitMQ Docker container.

```bash
docker-compose up
```

Shell 2: Add files to the local MinIO storage.

```bash
python scripts/manage_minio.py add <MinIO target folder> -d path/to/a/folder/with/PDFs
```
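`scripts/manage_minio.py` is part of the repository and not reproduced here; as a rough sketch of what the `add` step does, the `minio` client could upload a folder of PDFs like this, assuming the default endpoint, credentials, and bucket from the configuration table:

```python
import pathlib

from minio import Minio  # MinIO Python client, used here only for illustration

# Defaults from the configuration table above.
client = Minio("127.0.0.1:9000", access_key="root", secret_key="password", secure=False)

bucket = "pyinfra-test-bucket"  # STORAGE_BUCKET default
if not client.bucket_exists(bucket):
    client.make_bucket(bucket)

# Upload every PDF in a local folder into the bucket.
for pdf in pathlib.Path("path/to/a/folder/with/PDFs").glob("*.pdf"):
    client.fput_object(bucket, pdf.name, str(pdf))
    print(f"uploaded {pdf.name}")
```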
Shell 2: Run the pyinfra server.

```bash
python src/serve.py
```

or as a container:

```bash
docker run --net=host pyinfra
```

Shell 3: Run the analysis container.

Shell 4: Start a client that sends requests to process PDFs from the MinIO store and annotates these PDFs according to the service responses.

```bash
python scripts/mock_client.py
```