Matthias Bisping a00ceae0e5 Pull request #28: Prefetch adjustment
Merge in RR/pyinfra from prefetch_adjustment to master

Squashed commit of the following:

commit 6f9d75bf49ad196bf5728386527499025ac27b3a
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Thu Apr 21 13:47:40 2022 +0200

    removed tests without much value that caused teardown problems with docker containers

commit b7ccbe20e3babbf1127ea5738a1d710d8029c90b
Merge: 51a459c 5925737
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Thu Apr 21 13:04:34 2022 +0200

    Merge branch 'sonarscan_refac' into prefetch_adjustment

commit 51a459cbf04e9884cf6b7c2c3145206ecf1a0ffb
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Thu Apr 21 13:03:48 2022 +0200

    set prefetch count to 1; removed obsolete imports

commit 592573793cdfd098012a98cfc7ab0cc1fbfd0e44
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Tue Apr 19 18:01:46 2022 +0200

    refactoring; cleanup

Infrastructure to deploy Research Projects

The infrastructure expects to be deployed in the same Pod or local environment as the analysis container and handles all outbound communication.

Configuration

The configuration is located in /config.yaml. All relevant variables can be set by exporting environment variables.

| Environment Variable | Default | Description |
|---|---|---|
| LOGGING_LEVEL_ROOT | DEBUG | Logging level for service logger |
| PROBING_WEBSERVER_HOST | "0.0.0.0" | Probe webserver address |
| PROBING_WEBSERVER_PORT | 8080 | Probe webserver port |
| PROBING_WEBSERVER_MODE | production | Webserver mode: {development, production} |
| RABBITMQ_HOST | localhost | RabbitMQ host address |
| RABBITMQ_PORT | 5672 | RabbitMQ host port |
| RABBITMQ_USERNAME | user | RabbitMQ username |
| RABBITMQ_PASSWORD | bitnami | RabbitMQ password |
| RABBITMQ_HEARTBEAT | 7200 | Controls AMQP heartbeat timeout in seconds |
| REQUEST_QUEUE | request_queue | Requests to service |
| RESPONSE_QUEUE | response_queue | Responses by service |
| DEAD_LETTER_QUEUE | dead_letter_queue | Messages that failed to process |
| ANALYSIS_ENDPOINT | "http://127.0.0.1:5000" | Endpoint for analysis container |
| STORAGE_BACKEND | s3 | The type of storage to use {s3, azure} |
| STORAGE_BUCKET | "pyinfra-test-bucket" | The bucket / container to pull files specified in queue requests from |
| STORAGE_ENDPOINT | "http://127.0.0.1:9000" | Endpoint for s3 storage |
| STORAGE_KEY | root | User for s3 storage |
| STORAGE_SECRET | password | Password for s3 storage |
| STORAGE_AZURECONNECTIONSTRING | "DefaultEndpointsProtocol=..." | Connection string for Azure storage |
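
To illustrate how an exported environment variable could take precedence over a documented default, here is a minimal Python sketch. It is not pyinfra's actual configuration loader: `DEFAULTS`, `read_setting` and `rabbitmq_port` are hypothetical names, and only a subset of the variables listed above is included.

```python
import os

# Defaults copied from the list above (subset only). Values are kept as
# strings because environment variables are always strings.
DEFAULTS = {
    "RABBITMQ_HOST": "localhost",
    "RABBITMQ_PORT": "5672",
    "RABBITMQ_HEARTBEAT": "7200",
    "REQUEST_QUEUE": "request_queue",
    "RESPONSE_QUEUE": "response_queue",
    "DEAD_LETTER_QUEUE": "dead_letter_queue",
    "STORAGE_BACKEND": "s3",
}


def read_setting(name, environ=None):
    """Hypothetical helper: return the exported value, else the documented default."""
    environ = os.environ if environ is None else environ
    return environ.get(name, DEFAULTS[name])


def rabbitmq_port(environ=None):
    """Return the RabbitMQ port as an integer."""
    return int(read_setting("RABBITMQ_PORT", environ))
```

With this sketch, running `export RABBITMQ_HOST=broker` before starting the process would make `read_setting("RABBITMQ_HOST")` return `"broker"` instead of `"localhost"`.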

Response Format

Expected AMQP input message:

{
   "dossierId": "",
   "fileId": "",
   "targetFileExtension": "",
   "responseFileExtension": ""
}

Optionally, the input message can contain a field with the key "operations".
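
As a rough illustration of the input format, the following Python sketch builds a request body with the four documented keys and checks an incoming body for missing ones. The helper names and the example values are hypothetical, and the shape of the "operations" value is not specified here, so the sketch passes it through unchanged.

```python
import json

# The four keys of the expected AMQP input message.
REQUIRED_KEYS = ("dossierId", "fileId", "targetFileExtension", "responseFileExtension")


def build_request(dossier_id, file_id, target_ext, response_ext, operations=None):
    """Hypothetical helper: serialize a request message to a JSON string."""
    message = {
        "dossierId": dossier_id,
        "fileId": file_id,
        "targetFileExtension": target_ext,
        "responseFileExtension": response_ext,
    }
    # "operations" is optional; its structure is an open assumption.
    if operations is not None:
        message["operations"] = operations
    return json.dumps(message)


def missing_keys(body):
    """Return the required keys that are absent from a JSON request body."""
    message = json.loads(body)
    return [key for key in REQUIRED_KEYS if key not in message]
```

A producer could call `build_request(...)` and publish the resulting string to the request queue, while a consumer could reject bodies for which `missing_keys` returns a non-empty list.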

AMQP output message:

{
  "dossierId": "",
  "fileId": "",
  ...
}
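
The flow from input message to output message might be sketched as follows. This is an assumption-laden outline, not the code in src/serve.py: the four callables (`download`, `analyse`, `publish_response`, `publish_dead_letter`) are hypothetical stand-ins for storage access, the call to the analysis endpoint, and the two queues. The response only echoes "dossierId" and "fileId", because the remaining output fields are elided above, and what happens to the analysis result is not described here, so the sketch ignores it.

```python
import json


def handle_request(body, download, analyse, publish_response, publish_dead_letter):
    """Process one request body; returns True on success, False otherwise."""
    try:
        request = json.loads(body)
        data = download(request)   # stand-in: fetch the file from storage
        analyse(data, request)     # stand-in: call the analysis container
        # Only the two documented output keys are echoed back.
        response = {
            "dossierId": request["dossierId"],
            "fileId": request["fileId"],
        }
        publish_response(json.dumps(response))
        return True
    except Exception:
        # Assumption: a message that fails to process goes to the dead letter queue.
        publish_dead_letter(body)
        return False
```

Passing the collaborators in as arguments keeps the sketch testable without a running RabbitMQ or MinIO instance.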

Development

Either run src/serve.py or the built Docker image.

Setup

Install the module:

pip install -e .
pip install -r requirements.txt

Or build the Docker image:

docker build -f Dockerfile -t pyinfra .

Usage

Shell 1: Start the MinIO and RabbitMQ Docker containers.

docker-compose up

Shell 2: Add files to the local MinIO storage.

python scripts/manage_minio.py add <MinIO target folder> -d path/to/a/folder/with/PDFs

Shell 2: Run pyinfra-server.

python src/serve.py

or as container:

docker run --net=host pyinfra

Shell 3: Run the analysis container.

Shell 4: Start a client that sends requests to process PDFs from the MinIO store and annotates these PDFs according to the service responses.

python scripts/mock_client.py