PyInfra
About
Common module providing the infrastructure to deploy research projects. The infrastructure is expected to be deployed in the same Pod / local environment as the analysis container and handles all outbound communication.
Configuration
The default configuration is located in /config.yaml. All relevant variables can be overridden by exporting environment variables (see the table below and the sketch that follows it).
| Environment Variable | Default | Description |
|---|---|---|
| LOGGING_LEVEL_ROOT | "DEBUG" | Logging level for service logger |
| MONITORING_ENABLED | True | Enables Prometheus monitoring |
| PROMETHEUS_METRIC_PREFIX | "redactmanager_research_service" | Prometheus metric prefix, per convention '{product_name}_{service_name}' |
| PROMETHEUS_HOST | "127.0.0.1" | Prometheus webserver address |
| PROMETHEUS_PORT | 8080 | Prometheus webserver port |
| RABBITMQ_HOST | "localhost" | RabbitMQ host address |
| RABBITMQ_PORT | "5672" | RabbitMQ host port |
| RABBITMQ_USERNAME | "user" | RabbitMQ username |
| RABBITMQ_PASSWORD | "bitnami" | RabbitMQ password |
| RABBITMQ_HEARTBEAT | 60 | Controls AMQP heartbeat timeout in seconds |
| RABBITMQ_CONNECTION_SLEEP | 5 | Controls AMQP connection sleep timer in seconds |
| REQUEST_QUEUE | "request_queue" | Requests to service |
| RESPONSE_QUEUE | "response_queue" | Responses by service |
| DEAD_LETTER_QUEUE | "dead_letter_queue" | Messages that failed to process |
| STORAGE_BACKEND | "s3" | The storage backend to use (s3 or azure) |
| STORAGE_BUCKET | "redaction" | The bucket / container from which files referenced in queue requests are pulled |
| STORAGE_ENDPOINT | "http://127.0.0.1:9000" | Endpoint for S3 storage |
| STORAGE_KEY | "root" | Access key / user for S3 storage |
| STORAGE_SECRET | "password" | Secret key / password for S3 storage |
| STORAGE_AZURECONNECTIONSTRING | "DefaultEndpointsProtocol=..." | Connection string for Azure storage |
| STORAGE_AZURECONTAINERNAME | "redaction" | Azure Blob Storage container name |
| WRITE_CONSUMER_TOKEN | "False" | Whether to write a consumer token to a file |
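For example, the defaults above can be overridden by exporting the corresponding environment variables before the configuration is loaded. A minimal sketch (values are illustrative; the exact precedence is determined by the config loader):

```python
import os

# Illustrative overrides only; any variable from the table above can be set
# the same way, e.g. in the Pod spec or a local shell.
os.environ["LOGGING_LEVEL_ROOT"] = "INFO"
os.environ["RABBITMQ_HOST"] = "rabbitmq.internal"  # hypothetical host
os.environ["STORAGE_BACKEND"] = "azure"

# Import and load the configuration after the environment is set.
from pyinfra import config

pyinfra_config = config.get_config()  # exported values take the place of the /config.yaml defaults
```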
Message Format
Expected AMQP input message:
Use either the new format, where absolute file paths are provided, or the legacy format with dossierId and fileId as strings. A tenant ID can optionally be provided in the message header (key: "X-TENANT-ID").
```json
{
    "targetFilePath": "",
    "responseFilePath": ""
}
```
or
```json
{
    "dossierId": "",
    "fileId": "",
    "targetFileExtension": "",
    "responseFileExtension": ""
}
```
Optionally, the input message can contain a field with the key "operations".
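As an illustration, a request in the legacy format could be published with pika roughly as follows. The queue name and credentials are the defaults from the configuration table, and the IDs and extensions are placeholders, not prescribed values:

```python
import json

import pika

# Connect using the default broker settings from the configuration table.
connection = pika.BlockingConnection(
    pika.ConnectionParameters(
        host="localhost",
        port=5672,
        credentials=pika.PlainCredentials("user", "bitnami"),
    )
)
channel = connection.channel()

message = {
    "dossierId": "dossier-123",   # hypothetical IDs for illustration
    "fileId": "file-456",
    "targetFileExtension": "pdf",
    "responseFileExtension": "json",
    "operations": [],             # optional field, see above
}

channel.basic_publish(
    exchange="",
    routing_key="request_queue",  # default REQUEST_QUEUE
    body=json.dumps(message),
    properties=pika.BasicProperties(headers={"X-TENANT-ID": "tenant-a"}),
)
connection.close()
```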
AMQP output message:
```json
{
    "targetFilePath": "",
    "responseFilePath": ""
}
```
or
```json
{
    "dossierId": "",
    "fileId": ""
}
```
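Correspondingly, a response can be read from the response queue. A small sketch, assuming the default RESPONSE_QUEUE and broker settings:

```python
import json

import pika

# Connect with the default broker settings and fetch a single message.
connection = pika.BlockingConnection(
    pika.ConnectionParameters(
        host="localhost",
        port=5672,
        credentials=pika.PlainCredentials("user", "bitnami"),
    )
)
channel = connection.channel()

method, properties, body = channel.basic_get(queue="response_queue", auto_ack=True)
if body is not None:
    response = json.loads(body)
    print(response)  # e.g. {"dossierId": "...", "fileId": "..."}
connection.close()
```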
Usage & API
Setup
Add the desired version of the pyinfra package to your pyproject.toml file and make sure to add our GitLab registry as a source.
For now, all internal packages used by pyinfra also have to be added to the pyproject.toml file.
Run poetry lock and poetry install to install the packages.
You can look up the latest version of the package in the GitLab registry. For the versions of internal dependencies in use, please refer to the pyproject.toml file.
```toml
[tool.poetry.dependencies]
pyinfra = { version = "x.x.x", source = "gitlab-research" }
kn-utils = { version = "x.x.x", source = "gitlab-research" }

[[tool.poetry.source]]
name = "gitlab-research"
url = "https://gitlab.knecon.com/api/v4/groups/19/-/packages/pypi/simple"
priority = "explicit"
```
API
```python
from pyinfra import config
from pyinfra.payload_processing.processor import make_payload_processor
from pyinfra.queue.queue_manager import QueueManager

pyinfra_config = config.get_config()
process_payload = make_payload_processor(process_data, config=pyinfra_config)
queue_manager = QueueManager(pyinfra_config)
queue_manager.start_consuming(process_payload)
```
process_data should accept a dict (JSON) or bytes (PDF) as input and return a list of results.
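A minimal sketch of such a callback (the return shape is illustrative; only the dict-or-bytes contract comes from this README):

```python
from typing import Union


def process_data(payload: Union[dict, bytes]) -> list:
    """Hypothetical callback passed to make_payload_processor."""
    if isinstance(payload, dict):
        # JSON request, e.g. the dossierId / fileId message shown above.
        return [{"received": payload}]
    # Raw bytes, e.g. a PDF pulled from storage.
    return [{"received_bytes": len(payload)}]
```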
Scripts
Run pyinfra locally
Shell 1: Start minio and rabbitmq containers
```bash
cd tests && docker-compose up
```
Shell 2: Start pyinfra with callback mock
```bash
python scripts/start_pyinfra.py
```
Shell 3: Upload dummy content on storage and publish message
```bash
python scripts/send_request.py
```
Tests
Running all tests takes longer than you are probably used to, mainly because the startup times for docker-compose-dependent tests are quite high. The tests are therefore split into two parts: the first contains all tests that do not require docker-compose, the second all tests that do. By default, only the first part is executed, but when releasing a new version, all tests should be run.