Merge in RR/pyinfra from add_tests to master
Squashed commit of the following:
commit aa99d15c0487304a06bb36d990765fdca28bb651
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Mar 4 13:05:27 2022 +0100
RED-3463: Add some important tests
commit 666bb067ae19ee221ee9a4437f5a2c0bc749b3a6
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Mar 4 09:57:03 2022 +0100
blackkky
commit f6ce3be25e32f0c612b76a88e83b473a4d9ce605
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Mar 4 09:56:25 2022 +0100
added tests for processor and analyzer
commit b08f2742742c52400d9b8c984025c66f25c2bede
Author: cdietrich <clarissa.dietrich@iqser.com>
Date: Fri Mar 4 08:19:55 2022 +0100
add tests
commit 7a37b73db0567d6bbb5d854c73c7bdf57fa34043
Merge: a3056c0 cc0c1bb
Author: cdietrich <clarissa.dietrich@iqser.com>
Date: Thu Mar 3 11:58:03 2022 +0100
Merge branch 'master' of ssh://git.iqser.com:2222/rr/pyinfra
commit a3056c0df73091ee766491331f07e3a0377e7887
Author: cdietrich <clarissa.dietrich@iqser.com>
Date: Thu Mar 3 10:24:54 2022 +0100
refactor save response as file
Infrastructure to deploy Research Projects
The Infrastructure expects to be deployed in the same Pod / local environment as the analysis container and handles all outbound communication.
Configuration
The configuration file is located at /config.yaml. All relevant variables can be set by exporting environment variables.
| Environment Variable | Default | Description |
|---|---|---|
| service | ||
| LOGGING_LEVEL_ROOT | DEBUG | Logging level for service logger |
| RESPONSE_TYPE | "stream" | Whether the analysis response is stored as a file on storage or sent as a stream: "file" or "stream" |
| RESPONSE_FILE_EXTENSION | ".NER_ENTITIES.json.gz" | Extension of the file that stores the analyzed response on storage |
| probing_webserver | ||
| PROBING_WEBSERVER_HOST | "0.0.0.0" | Probe webserver address |
| PROBING_WEBSERVER_PORT | 8080 | Probe webserver port |
| PROBING_WEBSERVER_MODE | production | Webserver mode: {development, production} |
| rabbitmq | ||
| RABBITMQ_HOST | localhost | RabbitMQ host address |
| RABBITMQ_PORT | 5672 | RabbitMQ host port |
| RABBITMQ_USERNAME | user | RabbitMQ username |
| RABBITMQ_PASSWORD | bitnami | RabbitMQ password |
| RABBITMQ_HEARTBEAT | 7200 | Controls AMQP heartbeat timeout in seconds |
| queues | ||
| REQUEST_QUEUE | request_queue | Requests to service |
| RESPONSE_QUEUE | response_queue | Responses by service |
| DEAD_LETTER_QUEUE | dead_letter_queue | Messages that failed to process |
| callback | ||
| RETRY | False | Toggles retry behaviour |
| MAX_ATTEMPTS | 3 | Number of times a message may fail before being published to dead letter queue |
| ANALYSIS_ENDPOINT | "http://127.0.0.1:5000" | |
| storage | ||
| STORAGE_BACKEND | s3 | The type of storage to use {s3, azure} |
| STORAGE_BUCKET | "pyinfra-test-bucket" | The bucket / container to pull files specified in queue requests from |
| TARGET_FILE_EXTENSION | ".TEXT.json.gz" | Defines type of file to pull from storage: .TEXT.json.gz or .ORIGIN.pdf.gz |
| STORAGE_ENDPOINT | "http://127.0.0.1:9000" | |
| STORAGE_KEY | ||
| STORAGE_SECRET | ||
| STORAGE_AZURECONNECTIONSTRING | "DefaultEndpointsProtocol=..." | |
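
As a rough illustration of how exported environment variables can take precedence over the defaults listed above, the following sketch resolves a few settings from a mapping. It is not pyinfra's actual configuration loader: the `DEFAULTS` dictionary and the `resolve_settings` helper are hypothetical names, and only the variable names and default values are taken from the table.

```python
import os

# Defaults copied from the table above (subset only, for illustration).
DEFAULTS = {
    "RABBITMQ_HOST": "localhost",
    "RABBITMQ_PORT": 5672,
    "RABBITMQ_HEARTBEAT": 7200,
    "RETRY": False,
    "MAX_ATTEMPTS": 3,
    "STORAGE_BACKEND": "s3",
}


def _coerce(raw, default):
    """Convert an environment string to the type of the default value."""
    # Check bool before int, because bool is a subclass of int in Python.
    if isinstance(default, bool):
        return raw.strip().lower() in {"1", "true", "yes"}
    if isinstance(default, int):
        return int(raw)
    return raw


def resolve_settings(env=None):
    """Return the defaults, overridden by any matching environment variables."""
    env = os.environ if env is None else env
    return {
        name: _coerce(env[name], default) if name in env else default
        for name, default in DEFAULTS.items()
    }


if __name__ == "__main__":
    # Hypothetical overrides, as if exported in the shell before starting the service.
    settings = resolve_settings({"RABBITMQ_PORT": "5673", "RETRY": "true"})
    print(settings["RABBITMQ_PORT"], settings["RETRY"], settings["MAX_ATTEMPTS"])
```

Passing the environment as a plain mapping keeps the helper easy to test without touching the real process environment.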
Response Format
RESPONSE_AS_FILE == False
Response-Format:
{
  "dossierId": "klaus",
  "fileId": "1a7fd8ac0da7656a487b68f89188be82",
  "imageMetadata": ANALYSIS_DATA
}
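
A minimal sketch of assembling a message in this shape, assuming the analysis result is already available as JSON-serializable Python data. The helper names `build_response` and `encode_response` are hypothetical and not part of pyinfra; only the three keys come from the format above.

```python
import json


def build_response(dossier_id, file_id, analysis_data):
    """Assemble the response body with the three keys of the response format."""
    return {
        "dossierId": dossier_id,
        "fileId": file_id,
        "imageMetadata": analysis_data,
    }


def encode_response(response):
    """Serialize the response to UTF-8 JSON bytes, e.g. as a queue message body."""
    return json.dumps(response).encode("utf-8")


if __name__ == "__main__":
    body = encode_response(
        build_response("klaus", "1a7fd8ac0da7656a487b68f89188be82", [])
    )
    print(body.decode("utf-8"))
```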
Response-example for image-prediction
{
  "dossierId": "klaus",
  "fileId": "1a7fd8ac0da7656a487b68f89188be82",
  "imageMetadata": [
    {
      "classification": {
        "label": "logo",
        "probabilities": {
          "formula": 0.0,
          "logo": 1.0,
          "other": 0.0,
          "signature": 0.0
        }
      },
      "filters": {
        "allPassed": true,
        "geometry": {
          "imageFormat": {
            "quotient": 1.570791527313267,
            "tooTall": false,
            "tooWide": false
          },
          "imageSize": {
            "quotient": 0.19059804229011604,
            "tooLarge": false,
            "tooSmall": false
          }
        },
        "probability": {
          "unconfident": false
        }
      },
      "geometry": {
        "height": 107.63999999999999,
        "width": 169.08000000000004
      },
      "position": {
        "pageNumber": 1,
        "x1": 213.12,
        "x2": 382.20000000000005,
        "y1": 568.7604,
        "y2": 676.4004
      }
    }
  ]
}
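
To show how a consumer might read such a response, here is a small sketch that reduces each `imageMetadata` entry to its label, the probability of that label, the page number, and the filter result. The function names `summarize_image` and `accepted_images` are hypothetical; the field names are the ones in the example above, and the sketch assumes every entry has the same structure.

```python
def summarize_image(entry):
    """Reduce one imageMetadata entry to label, probability, page and filter status."""
    classification = entry["classification"]
    label = classification["label"]
    return {
        "label": label,
        "probability": classification["probabilities"][label],
        "page": entry["position"]["pageNumber"],
        "passed_filters": entry["filters"]["allPassed"],
    }


def accepted_images(response):
    """Summarize only the entries whose filters all passed."""
    return [
        summarize_image(entry)
        for entry in response["imageMetadata"]
        if entry["filters"]["allPassed"]
    ]


if __name__ == "__main__":
    # Trimmed copy of the example response, keeping only the fields read here.
    example = {
        "dossierId": "klaus",
        "fileId": "1a7fd8ac0da7656a487b68f89188be82",
        "imageMetadata": [
            {
                "classification": {
                    "label": "logo",
                    "probabilities": {"formula": 0.0, "logo": 1.0, "other": 0.0, "signature": 0.0},
                },
                "filters": {"allPassed": True},
                "position": {"pageNumber": 1},
            }
        ],
    }
    print(accepted_images(example))
```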
RESPONSE_AS_FILE == True
Creates a response file on the request storage, named dossier_Id / file_Id + RESPONSE_FILE_EXTENSION, with the ANALYSIS_DATA as its content.
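
The naming rule can be sketched as follows. This is an illustration under two assumptions that the text above does not confirm: that `/` joins the dossier and file IDs into a single object name, and that the content is gzip-compressed JSON (suggested only by the default extension ".NER_ENTITIES.json.gz"). The helper names are hypothetical, and the upload to the storage backend is left out.

```python
import gzip
import json


def response_object_name(dossier_id, file_id, extension=".NER_ENTITIES.json.gz"):
    """Build the object name: dossier ID, '/', file ID, then the response extension."""
    return f"{dossier_id}/{file_id}{extension}"


def encode_response_file(analysis_data):
    """Serialize the analysis data to JSON and gzip it (assumed file encoding)."""
    return gzip.compress(json.dumps(analysis_data).encode("utf-8"))


if __name__ == "__main__":
    name = response_object_name("klaus", "1a7fd8ac0da7656a487b68f89188be82")
    payload = encode_response_file([{"classification": {"label": "logo"}}])
    print(name, len(payload), "bytes")
```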
Development
Local Setup
You can run the infrastructure either as a module via src/serve.py or as a Docker container simulating the Kubernetes environment.
- Install module / build docker image

      pip install -e .
      pip install -r requirements.txt
      docker build -f Dockerfile -t pyinfra .

- Run rabbitmq & minio

      docker-compose up

- Run module / docker container

      python src/serve.py
      docker run --net=host pyinfra
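
When following the steps above, it can help to confirm that RabbitMQ and the storage endpoint accept connections before starting the service. The sketch below is a hypothetical helper, not part of pyinfra, that polls TCP ports using only the standard library.

```python
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_services(services, attempts=30, delay=1.0):
    """Poll until every (host, port) pair accepts connections.

    Returns True as soon as all services are reachable, or False once
    the given number of attempts has been used up.
    """
    for _ in range(attempts):
        if all(port_is_open(host, port) for host, port in services):
            return True
        time.sleep(delay)
    return False
```

For the default configuration, a call such as `wait_for_services([("127.0.0.1", 5672), ("127.0.0.1", 9000)])` checks the default RabbitMQ port and the port of the default STORAGE_ENDPOINT. It assumes both services run on the local machine.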