Compare commits

...

129 Commits

Author SHA1 Message Date
Matthias Bisping
ba901473fe Set alpha in box frame drawing logic 2023-02-01 13:12:53 +01:00
Matthias Bisping
e8b4467265 Remove unused code 2023-02-01 13:12:20 +01:00
Matthias Bisping
4c65d906b8 Add fixme 2023-02-01 11:53:35 +01:00
Matthias Bisping
667b4a4858 Refactoring and text cell content tweaking 2023-02-01 11:32:37 +01:00
Matthias Bisping
83e6dc3ce7 Add IPython dev dependency 2023-02-01 11:32:12 +01:00
Matthias Bisping
fb69eb7f5c Refactoring
Break up conditional tree in cell building function
2023-02-01 10:09:32 +01:00
Matthias Bisping
f98256d7e9 Fix bug in table generation
- Removed the check `elif size < Size.LARGE.value` and made it into an `else`,
since it was intended to cover all cells that are larger than medium
size.

- Also disable page number generation for now
2023-02-01 09:58:34 +01:00
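As an aside, a minimal sketch of the conditional fix described in f98256d7e9 above; the Size enum values and content helpers are invented for illustration and are not the repository's actual code:

from enum import Enum

class Size(Enum):
    SMALL = 1
    MEDIUM = 2
    LARGE = 3

def pick_cell_content(size: int) -> str:
    # Before the fix, the second branch read `elif size < Size.LARGE.value:`,
    # so cells at LARGE size or above matched no branch at all.
    if size < Size.MEDIUM.value:
        return "short text"
    else:
        # The `else` now covers every cell larger than medium size, as intended.
        return "long text or a nested table"

print(pick_cell_content(Size.LARGE.value))  # now handled by the else branch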
Matthias Bisping
cbb3a8cc61 [WIP] Page numbers 2023-01-31 17:09:59 +01:00
Matthias Bisping
9f9face8f0 Make scatterplots more variable 2023-01-31 16:33:48 +01:00
Matthias Bisping
f2af040c5b [WIP] texture and content blending with blend_modes module 2023-01-31 16:05:08 +01:00
Matthias Bisping
6dbe3b6fc9 Refactoring 2023-01-31 14:37:46 +01:00
Matthias Bisping
a3fece8096 Found first issue for pale colors 2023-01-31 14:16:33 +01:00
Matthias Bisping
26180373a0 Remove unused imports 2023-01-31 13:55:13 +01:00
Matthias Bisping
186b4530f0 [WIP] Make texture show through page content 2023-01-31 13:53:52 +01:00
Matthias Bisping
a1ccda4ea9 Refactoring 2023-01-30 14:11:48 +01:00
Matthias Bisping
25d35e2349 Fix bug in booktabs code 2023-01-25 20:55:19 +01:00
Matthias Bisping
daea7d2bf7 [WIP] head and bottom border (booktabs-like) for tables 2023-01-25 20:18:09 +01:00
Matthias Bisping
d5e501a05d Tweak plots 2023-01-25 19:44:54 +01:00
Matthias Bisping
d9d363834a Tweak plots and table cells
- Choice of plot depends on aspect ratio of rectangle now and is handled
in the plot constructor

- Made pie charts more diverse

- Table cell background is now a complementary color chosen against the
  colormap
2023-01-25 19:13:43 +01:00
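A hedged illustration of the aspect-ratio-driven plot choice mentioned in d9d363834a; the thresholds and plot names are made up, only the dispatch idea comes from the commit message:

def choose_plot_kind(width: float, height: float) -> str:
    # Hypothetical dispatch: wide rectangles get horizontal plots, tall ones
    # vertical plots, roughly square ones a pie chart.
    aspect_ratio = width / height
    if aspect_ratio > 1.5:
        return "horizontal bar chart"
    if aspect_ratio < 0.67:
        return "vertical bar chart"
    return "pie chart"

print(choose_plot_kind(400, 200))  # wide rectangle -> "horizontal bar chart"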
Matthias Bisping
5dc13e7137 [WIP] More table / cell edge fiddling and issue fixing
Fix: The cell width and height were rounded to int in the table
constructor. The imprecision of rounding would accumulate when stacking
cells in a row or column, leading to gaps at the bottom and right-hand
edge of tables.
The rounding has now been removed and left to the cell constructor.
Cells are derived from the Rectangle class, which does the rounding
itself. This eliminates the issue with accumulated gaps in the tables.
2023-01-25 18:16:36 +01:00
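To make the rounding issue from 5dc13e7137 concrete, a small self-contained example (the numbers are arbitrary): pre-rounding each cell width loses a fraction of a pixel per column, while rounding the cell boundaries, as the Rectangle class now does, keeps the last cell flush with the table edge.

table_width = 1000
n_columns = 7
exact_cell_width = table_width / n_columns  # ~142.86 px

# Rounding every cell width before stacking drops ~0.86 px per column ...
stacked_width = sum(int(exact_cell_width) for _ in range(n_columns))
print(table_width - stacked_width)  # 6 px gap at the right-hand edge

# ... while rounding the cell boundaries keeps the table edge exact.
boundaries = [round(i * exact_cell_width) for i in range(n_columns + 1)]
print(table_width - boundaries[-1])  # 0 px gap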
Matthias Bisping
826cd3b6a9 [WIP] More table / cell edge fiddling and issue fixing 2023-01-25 17:23:30 +01:00
Matthias Bisping
4f788af35b [WIP] More table / cell edge fiddling and issue fixing
Cells now draw only inner borders and the table draws the outer border
if the layout is "closed". This avoids multiple lines around cells of
nested tables, since nested tables are now created with the layout
parameter set to "open", in which case the table does not draw its
borders.
2023-01-25 10:31:17 +01:00
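A compact sketch of the border ownership scheme described in 4f788af35b; class and method names are placeholders, not the actual API:

class Table:
    def __init__(self, cells, layout="closed"):
        self.cells, self.layout = cells, layout

    def draw(self, canvas):
        for cell in self.cells:
            if isinstance(cell, Table):
                cell.draw(canvas)  # nested tables are built with layout="open"
            else:
                canvas.append(f"inner borders of {cell!r}")
        if self.layout == "closed":
            canvas.append("outer border")  # the frame is drawn once, by the outer table

canvas = []
Table(["a", "b", Table(["c"], layout="open")]).draw(canvas)
print(canvas)  # only one "outer border" entry -> no doubled lines around the nested table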
Matthias Bisping
10ea584143 [WIP] More table / cell edge fiddling and issue fixing 2023-01-24 15:44:24 +01:00
Matthias Bisping
7676a8148e [WIP] More table / cell edge fiddling and issue fixing 2023-01-24 13:53:59 +01:00
Matthias Bisping
cee5e69a4b Make page generation reproducible
Tie all structural random events to a seeded random object.
2023-01-24 13:07:45 +01:00
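A minimal sketch of the seeding approach from cee5e69a4b, using only the standard library (the page-generation details are placeholders): every structural decision draws from one seeded random.Random instance, so the same seed reproduces the same page.

import random

def generate_page_layout(seed: int):
    # One seeded RNG drives all structural choices instead of the module-level
    # `random` functions, so a given seed always yields the same layout.
    rng = random.Random(seed)
    n_boxes = rng.randint(2, 6)
    return [(rng.random(), rng.random()) for _ in range(n_boxes)]  # e.g. box positions

assert generate_page_layout(42) == generate_page_layout(42)  # reproducible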
Matthias Bisping
e715c86f8d Fix box clashes
Rewrote box generation sequence and eliminated issue with gaps /
overlapping boxes
2023-01-24 12:07:47 +01:00
Matthias Bisping
c5ba489931 Refactoring 2023-01-24 10:49:38 +01:00
Matthias Bisping
3772ca021a Refactoring
Pull base class for page partitioner out of page partitioner and add random
page partitioner derived class.
2023-01-24 10:13:03 +01:00
Matthias Bisping
c4eeb956ca Fix incorrect font size kwarg 2023-01-24 10:12:44 +01:00
Matthias Bisping
d823ebf7c6 Refactoring
Refactor page partitioner
2023-01-24 10:00:35 +01:00
Matthias Bisping
71ffb28381 Re-add actual block generator calls 2023-01-23 15:38:20 +01:00
Matthias Bisping
9dfbe9a142 Add different basic table layouts 2023-01-23 15:30:12 +01:00
Matthias Bisping
0eb57056ba Fix / improve table cell border drawing 2023-01-23 15:16:26 +01:00
Matthias Bisping
70802d6341 Fix error in image padding logic 2023-01-23 14:38:42 +01:00
Matthias Bisping
52776494cb Tweak font selection 2023-01-23 14:15:41 +01:00
Matthias Bisping
7d8842b4ac Refactoring & Add table captions 2023-01-23 13:14:48 +01:00
Matthias Bisping
9e77e25afb Refactoring
Move text block generation code into its own class.
2023-01-23 12:22:42 +01:00
Matthias Bisping
b3480491be Refactoring
Move line formatting code into its own class.
2023-01-23 12:13:22 +01:00
Matthias Bisping
3d0c2396ee Remove obsolete code 2023-01-23 12:05:01 +01:00
Matthias Bisping
f8c2d691b2 [WIP] Figure captions 2023-01-23 12:04:27 +01:00
Matthias Bisping
ced1cd9559 More tables less text 2023-01-18 19:51:13 +01:00
Matthias Bisping
738c51a337 Merge branch 'refactoring' of git+ssh://git.iqser.com:2222/rr/cv-analysis into refactoring 2023-01-18 19:25:24 +01:00
Matthias Bisping
48f6aebc13 Tweaking 2023-01-18 19:22:48 +01:00
Matthias Bisping
73d546367c Tweaking 2023-01-18 18:53:56 +01:00
Matthias Bisping
cfe4b58e38 Add option to put specific text into text block 2023-01-18 17:19:12 +01:00
Matthias Bisping
839a264816 Add option to put specific text into text block 2023-01-18 17:18:56 +01:00
Matthias Bisping
fd57fe99b7 Tweak content selection logic 2023-01-18 16:56:25 +01:00
Matthias Bisping
5e51fd1d10 Tweak content selection logic 2023-01-18 16:17:20 +01:00
Matthias Bisping
9c7c5e315f Select cell content conditioned on cell size class 2023-01-18 15:54:24 +01:00
Matthias Bisping
3da613af94 [WIP] recursive random table: Add recursive construction 2023-01-18 15:04:04 +01:00
Matthias Bisping
30e6350881 [WIP] recursive random table
Add padding between cell content and cell border
2023-01-18 14:50:15 +01:00
Matthias Bisping
384f0e5f28 [WIP] recursive random table: tweak cell border and fill logic 2023-01-18 13:42:38 +01:00
Matthias Bisping
4d181448b6 [WIP] recursive random table: basic version working 2023-01-18 13:30:19 +01:00
Matthias Bisping
a5cd3d6ec9 [WIP] recursive random table 2023-01-18 13:11:15 +01:00
Matthias Bisping
893622a73e [WIP] recursive random table 2023-01-18 11:45:19 +01:00
Matthias Bisping
4d11a157e5 Cache font selection 2023-01-18 09:39:04 +01:00
Matthias Bisping
4c10d521e2 [WIP] random font selection 2023-01-17 14:58:54 +01:00
Matthias Bisping
0f6cbec1d5 Refactoring 2023-01-17 13:43:12 +01:00
Matthias Bisping
54484d9ad0 [WIP] random table segments: Table via tabulate and text -> image 2023-01-17 13:23:53 +01:00
Matthias Bisping
ca190721d6 [WIP] random table segments & refactoring 2023-01-17 13:17:33 +01:00
Matthias Bisping
5611314ff3 [WIP] random table segments 2023-01-17 11:42:11 +01:00
Matthias Bisping
4ecfe16df5 Constrain possible random layouts 2023-01-17 11:12:25 +01:00
Matthias Bisping
38c0614396 Assign box type by box aspect ratio 2023-01-17 10:59:53 +01:00
Matthias Bisping
64565f9cb0 Complete first iteration of random plot generation 2023-01-17 10:55:09 +01:00
Matthias Bisping
232c6bed4b Refactoring: Rename 2023-01-17 09:54:50 +01:00
Matthias Bisping
8d34873d1c [WIP] random plot segments 2023-01-16 19:33:46 +01:00
Matthias Bisping
78a951a319 [WIP] random plot segments 2023-01-16 18:42:34 +01:00
Matthias Bisping
8d57d2043d [WIP] random text segments 2023-01-16 18:18:22 +01:00
Matthias Bisping
41fdda4955 [WIP] random text segments 2023-01-16 17:55:20 +01:00
Matthias Bisping
4dfdd579a2 [WIP] random text segments 2023-01-16 17:41:30 +01:00
Matthias Bisping
e831ab1382 [WIP] random text segments 2023-01-16 17:17:50 +01:00
Matthias Bisping
6fead2d9b9 [WIP] random text segments 2023-01-16 16:34:18 +01:00
Matthias Bisping
1012988475 Remove obsolete code 2023-01-16 13:35:59 +01:00
Matthias Bisping
5bc1550eae Complete page partitioning into empty boxes
Completed logic for partitioning page into content boxes. Next step is
to fill content boxes with random content.
2023-01-16 13:32:38 +01:00
Matthias Bisping
29741fc5da [WIP] random content box generation 2023-01-16 12:07:56 +01:00
Matthias Bisping
4772e3037c Remove obsolete code 2023-01-16 11:16:27 +01:00
Matthias Bisping
dd6ab94aa2 [WIP] Replace texture generation with loading textures from files 2023-01-16 10:59:13 +01:00
Matthias Bisping
eaca8725de Balance colors of base textures
Make base textures more similar in color balance
2023-01-16 10:19:05 +01:00
Matthias Bisping
4af202f098 Add base paper textures 2023-01-16 10:09:34 +01:00
Matthias Bisping
1199845cdf Refactoring: Rename 2023-01-16 08:47:45 +01:00
Matthias Bisping
4578413748 Improve page texture logic 2023-01-11 14:05:39 +01:00
Matthias Bisping
d5d67cb064 Fix image format (RGB/A, float/uint8, [0, 1/255]) issues 2023-01-11 12:17:07 +01:00
Matthias Bisping
d8542762e6 [WIP] Add augmentation pipeline to page generation 2023-01-10 17:13:39 +01:00
Matthias Bisping
caef416077 Tweak page generation 2023-01-10 16:37:35 +01:00
Matthias Bisping
a8708ffc56 [WIP] page generation for tests 2023-01-10 16:31:02 +01:00
Matthias Bisping
3f0bbf0fc7 Refactoring 2023-01-10 11:59:01 +01:00
Matthias Bisping
2fec39eda6 Add docstring 2023-01-10 11:31:13 +01:00
Matthias Bisping
16cc0007ed Refactoring 2023-01-10 11:30:36 +01:00
Matthias Bisping
3d83489819 Refactoring: Make single pass rectangle merging stateless 2023-01-10 11:14:15 +01:00
Matthias Bisping
3134021596 Add typehints 2023-01-10 10:20:07 +01:00
Matthias Bisping
3cb857d830 Refactoring: Move 2023-01-10 10:19:49 +01:00
Matthias Bisping
194102939e Refactoring
- add typehints
- other minor refactorings
2023-01-10 10:10:08 +01:00
Matthias Bisping
5d1d9516b5 Add fixmes and format docstring 2023-01-10 09:39:23 +01:00
Matthias Bisping
77f85e9de1 Refactoring
Various
2023-01-09 17:22:01 +01:00
Matthias Bisping
c00081b2bc Refactoring: Move 2023-01-09 17:01:36 +01:00
Matthias Bisping
619f67f1fd Refactoring
Various
2023-01-09 16:51:58 +01:00
Matthias Bisping
a97f8def7c Refactor metrics 2023-01-09 16:22:52 +01:00
Matthias Bisping
65e9735bd9 Refactor metrics 2023-01-09 15:53:53 +01:00
Matthias Bisping
689be75478 Refactoring 2023-01-09 15:47:12 +01:00
Matthias Bisping
acf46a7a48 [WIP] Refactoring meta-detection 2023-01-09 15:40:32 +01:00
Matthias Bisping
0f11441b20 [WIP] Refactoring meta-detection 2023-01-09 15:32:51 +01:00
Matthias Bisping
fa1fa15cc8 [WIP] Refactoring meta-detection 2023-01-09 15:05:00 +01:00
Matthias Bisping
17c40c996a [WIP] Refactoring meta-detection 2023-01-09 14:44:22 +01:00
Matthias Bisping
99af2943b5 [WIP] Refactoring meta-detection 2023-01-09 14:33:27 +01:00
Matthias Bisping
0e6cb495e8 [WIP] Refactoring meta-detection 2023-01-09 14:29:22 +01:00
Matthias Bisping
012e705e70 [WIP] Refactoring meta-detection 2023-01-09 14:22:18 +01:00
Matthias Bisping
8327794685 [WIP] Refactoring meta-detection 2023-01-09 13:42:23 +01:00
Matthias Bisping
72bc52dc7b [WIP] Refactoring meta-detection 2023-01-09 13:27:26 +01:00
Matthias Bisping
557d091a54 [WIP] Refactoring meta-detection 2023-01-09 12:03:50 +01:00
Matthias Bisping
b540cfd0f2 [WIP] Refactoring meta-detection 2023-01-09 11:38:55 +01:00
Matthias Bisping
8824c5c3ea Refactoring 2023-01-09 11:33:38 +01:00
Matthias Bisping
94e9210faf Refactoring
Various
2023-01-09 11:21:43 +01:00
Matthias Bisping
06d6863cc5 Format docstrings 2023-01-04 18:50:27 +01:00
Matthias Bisping
dfd87cb4b0 Refactoring 2023-01-04 18:29:52 +01:00
Matthias Bisping
cd5457840b Refactoring
Various
2023-01-04 18:13:54 +01:00
Matthias Bisping
eee2f0e256 Refactoring
Rename module
2023-01-04 17:40:43 +01:00
Matthias Bisping
9d2f166fbf Refactoring
Various
2023-01-04 17:36:06 +01:00
Matthias Bisping
97fb4b645d Refactoring
Remove more code that is not adhering to separation of concerns from Rectangle class
2023-01-04 16:49:44 +01:00
Matthias Bisping
00e53fb54d Refactoring
Remove code that is not adhering to separation of concerns from Rectangle class
2023-01-04 16:29:43 +01:00
Matthias Bisping
4be91de036 Refactoring
Further clean up Rectangle class
2023-01-04 15:26:39 +01:00
Matthias Bisping
8c6b940364 Refactoring
Clean up Rectangle class
2023-01-04 14:57:47 +01:00
Matthias Bisping
cdb12baccd Format docstrings 2023-01-04 13:57:51 +01:00
Matthias Bisping
ac84494613 Refactoring 2023-01-04 13:32:57 +01:00
Matthias Bisping
77f565c652 Fix
Fix a typehint
Fix a bug that would happen when a generator is passed
2023-01-04 12:06:28 +01:00
Matthias Bisping
47e657aaa3 Refactoring
Clean up and prove correctness of intersection computation
2023-01-04 12:05:57 +01:00
Matthias Bisping
b592497b75 Refactoring 2023-01-04 10:58:24 +01:00
Matthias Bisping
c0d961bc39 Merge branch 'poetrify' into refactoring 2023-01-04 10:12:50 +01:00
Matthias Bisping
8260ae58f9 Refactoring
Make adjacency checking code clean
2023-01-04 10:11:46 +01:00
Matthias Bisping
068f75d35b Apply black 2023-01-04 10:11:28 +01:00
46 changed files with 3696 additions and 580 deletions

View File

@@ -1,17 +1,17 @@
from functools import partial
import cv2
import numpy as np
from funcy import lmap
from cv_analysis.figure_detection.figures import detect_large_coherent_structures
from cv_analysis.figure_detection.text import remove_primary_text_regions
from cv_analysis.utils.conversion import contour_to_rectangle
from cv_analysis.utils.filters import (
is_large_enough,
has_acceptable_format,
is_not_too_large,
is_small_enough,
)
from cv_analysis.utils.postprocessing import remove_included
from cv_analysis.utils.structures import Rectangle
def detect_figures(image: np.array):
@@ -21,19 +21,18 @@ def detect_figures(image: np.array):
figure_filter = partial(is_likely_figure, min_area, max_area, max_width_to_height_ratio)
image = remove_primary_text_regions(image)
cnts = detect_large_coherent_structures(image)
cnts = filter(figure_filter, cnts)
contours = detect_large_coherent_structures(image)
contours = filter(figure_filter, contours)
rects = map(cv2.boundingRect, cnts)
rects = map(Rectangle.from_xywh, rects)
rects = remove_included(rects)
rectangles = lmap(contour_to_rectangle, contours)
rectangles = remove_included(rectangles)
return rects
return rectangles
def is_likely_figure(min_area, max_area, max_width_to_height_ratio, cnts):
def is_likely_figure(min_area, max_area, max_width_to_height_ratio, contours):
return (
is_not_too_large(cnts, max_area)
and is_large_enough(cnts, min_area)
and has_acceptable_format(cnts, max_width_to_height_ratio)
is_small_enough(contours, max_area)
and is_large_enough(contours, min_area)
and has_acceptable_format(contours, max_width_to_height_ratio)
)

View File

@@ -1,25 +1,33 @@
import cv2
import numpy as np
from cv_analysis.utils.common import find_contours_and_hierarchies
def detect_large_coherent_structures(image: np.array):
"""Detects large coherent structures on an image.
"""Detects large coherent structures in an image.
Expects an image in a binary color space (e.g. with a threshold applied).
Args:
image (np.array): Image to look for large coherent structures in.
Returns:
contours
list: List of contours.
References:
https://stackoverflow.com/questions/60259169/how-to-group-nearby-contours-in-opencv-python-zebra-crossing-detection
"""
assert len(image.shape) == 2
# FIXME: Parameterize via factory
dilate_kernel = cv2.getStructuringElement(cv2.MORPH_OPEN, (5, 5))
# FIXME: Parameterize via factory
dilate = cv2.dilate(image, dilate_kernel, iterations=4)
# FIXME: Parameterize via factory
close_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (20, 20))
close = cv2.morphologyEx(dilate, cv2.MORPH_CLOSE, close_kernel, iterations=1)
# FIXME: Parameterize via factory
close = cv2.morphologyEx(dilate, cv2.MORPH_CLOSE, close_kernel, iterations=1) # TODO: Tweak iterations
cnts, _ = cv2.findContours(close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours, _ = find_contours_and_hierarchies(close)
return cnts
return contours

View File

@@ -1,5 +1,7 @@
import cv2
from cv_analysis.utils.common import normalize_to_gray_scale
def remove_primary_text_regions(image):
"""Removes regions of primary text, meaning no figure descriptions for example, but main text body paragraphs.
@@ -35,6 +37,7 @@ def remove_primary_text_regions(image):
def apply_threshold_to_image(image):
"""Converts an image to black and white."""
image = normalize_to_gray_scale(image)
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) if len(image.shape) > 2 else image
return cv2.threshold(image, 253, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

View File

@@ -1,87 +1,80 @@
import itertools
from itertools import compress
from itertools import starmap
from operator import __and__
from functools import partial
from typing import Iterable, List
import cv2
import numpy as np
from funcy import compose, rcompose, lkeep
from cv_analysis.utils.connect_rects import connect_related_rects2
from cv_analysis.utils.structures import Rectangle
from cv_analysis.utils.postprocessing import (
remove_overlapping,
remove_included,
has_no_parent,
from cv_analysis.utils import lstarkeep
from cv_analysis.utils.common import (
find_contours_and_hierarchies,
dilate_page_components,
normalize_to_gray_scale,
threshold_image,
invert_image,
fill_rectangles,
)
from cv_analysis.utils.visual_logging import vizlogger
# could be a dynamic parameter if the scan is noisy
def is_likely_segment(rect, min_area=100):
return cv2.contourArea(rect, False) > min_area
from cv_analysis.utils.conversion import contour_to_rectangle
from cv_analysis.utils.merging import merge_related_rectangles
from cv_analysis.utils.postprocessing import remove_included, has_no_parent
from cv_analysis.utils.rectangle import Rectangle
def find_segments(image):
contours, hierarchies = cv2.findContours(image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
mask1 = map(is_likely_segment, contours)
mask2 = map(has_no_parent, hierarchies[0])
mask = starmap(__and__, zip(mask1, mask2))
contours = compress(contours, mask)
def parse_layout(image: np.array) -> List[Rectangle]:
"""Parse the layout of a page.
rectangles = (cv2.boundingRect(c) for c in contours)
Args:
image: Image of the page.
Returns:
List of rectangles representing the layout of the page as identified page elements.
"""
rectangles = rcompose(
find_segments,
remove_included,
merge_related_rectangles,
remove_included,
)(image)
return rectangles
def dilate_page_components(image):
# if text is detected as individual words, make the kernel bigger
image = cv2.GaussianBlur(image, (7, 7), 0)
thresh = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
return cv2.dilate(thresh, kernel, iterations=4)
def find_segments(image: np.ndarray) -> List[Rectangle]:
"""Find segments in a page. Segments are structural elements of a page, such as text blocks, tables, etc."""
rectangles = rcompose(
prepare_for_initial_detection,
__find_segments,
partial(prepare_for_meta_detection, image.copy()),
__find_segments,
)(image)
return rectangles
def fill_in_component_area(image, rect):
x, y, w, h = rect
cv2.rectangle(image, (x, y), (x + w, y + h), (0, 0, 0), -1)
cv2.rectangle(image, (x, y), (x + w, y + h), (255, 255, 255), 7)
_, image = cv2.threshold(image, 254, 255, cv2.THRESH_BINARY)
return ~image
def prepare_for_initial_detection(image: np.ndarray) -> np.ndarray:
return compose(dilate_page_components, normalize_to_gray_scale)(image)
def __find_segments(image: np.ndarray) -> List[Rectangle]:
def to_rectangle_if_valid(contour, hierarchy):
return contour_to_rectangle(contour) if is_likely_segment(contour) and has_no_parent(hierarchy) else None
def parse_layout(image: np.array):
image = image.copy()
image_ = image.copy()
rectangles = lstarkeep(to_rectangle_if_valid, zip(*find_contours_and_hierarchies(image)))
if len(image_.shape) > 2:
image_ = cv2.cvtColor(image_, cv2.COLOR_BGR2GRAY)
return rectangles
dilate = dilate_page_components(image_)
# show_mpl(dilate)
rects = list(find_segments(dilate))
def prepare_for_meta_detection(image: np.ndarray, rectangles: Iterable[Rectangle]) -> np.ndarray:
image = rcompose(
fill_rectangles,
threshold_image,
invert_image,
normalize_to_gray_scale,
)(image, rectangles)
# -> Run meta detection on the previous detections TODO: refactor
for rect in rects:
x, y, w, h = rect
cv2.rectangle(image, (x, y), (x + w, y + h), (0, 0, 0), -1)
cv2.rectangle(image, (x, y), (x + w, y + h), (255, 255, 255), 7)
# show_mpl(image)
_, image = cv2.threshold(image, 254, 255, cv2.THRESH_BINARY)
image = ~image
# show_mpl(image)
if len(image.shape) > 2:
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
return image
rects = find_segments(image)
# <- End of meta detection
rects = list(map(Rectangle.from_xywh, rects))
rects = remove_included(rects)
rects = map(lambda r: r.xywh(), rects)
rects = connect_related_rects2(rects)
rects = list(map(Rectangle.from_xywh, rects))
rects = remove_included(rects)
return rects
def is_likely_segment(rectangle: Rectangle, min_area: float = 100) -> bool:
# FIXME: Parameterize via factory
return cv2.contourArea(rectangle, False) > min_area

View File

@@ -5,5 +5,8 @@ from pathlib import Path
MODULE_PATH = Path(__file__).resolve().parents[0]
PACKAGE_ROOT_PATH = MODULE_PATH.parents[0]
REPO_ROOT_PATH = PACKAGE_ROOT_PATH
TEST_DIR_PATH = REPO_ROOT_PATH / "test"
TEST_DATA_DVC = TEST_DIR_PATH / "test_data.dvc"
TEST_DATA_DVC = TEST_DIR_PATH / "test_data.dvc" # TODO: remove once new tests are in place
TEST_DATA_DIR = TEST_DIR_PATH / "data"
TEST_PAGE_TEXTURES_DIR = TEST_DATA_DIR / "paper"

View File

@@ -5,7 +5,7 @@ import numpy as np
from iteration_utilities import starfilter, first
from cv_analysis.utils.filters import is_large_enough, is_filled, is_boxy
from cv_analysis.utils.visual_logging import vizlogger
from cv_analysis.utils.visual_logger import vizlogger
def is_likely_redaction(contour, hierarchy, min_area):

View File

@@ -5,34 +5,29 @@ from funcy import lmap, flatten
from cv_analysis.figure_detection.figure_detection import detect_figures
from cv_analysis.table_parsing import parse_tables
from cv_analysis.utils.structures import Rectangle
from cv_analysis.utils.rectangle import Rectangle
from pdf2img.conversion import convert_pages_to_images
from pdf2img.default_objects.image import ImagePlus, ImageInfo
from pdf2img.default_objects.rectangle import RectanglePlus
def get_analysis_pipeline(operation, table_parsing_skip_pages_without_images):
if operation == "table":
return make_analysis_pipeline(
parse_tables,
table_parsing_formatter,
dpi=200,
skip_pages_without_images=table_parsing_skip_pages_without_images,
)
elif operation == "figure":
return make_analysis_pipeline(detect_figures, figure_detection_formatter, dpi=200)
def make_analysis_pipeline_for_element_type(segment_type, **kwargs):
if segment_type == "table":
return make_analysis_pipeline(parse_tables, table_parsing_formatter, dpi=200, **kwargs)
elif segment_type == "figure":
return make_analysis_pipeline(detect_figures, figure_detection_formatter, dpi=200, **kwargs)
else:
raise
raise ValueError(f"Unknown segment type {segment_type}.")
def make_analysis_pipeline(analysis_fn, formatter, dpi, skip_pages_without_images=False):
def analyse_pipeline(pdf: bytes, index=None):
def analysis_pipeline(pdf: bytes, index=None):
def parse_page(page: ImagePlus):
image = page.asarray()
rects = analysis_fn(image)
if not rects:
rectangles = analysis_fn(image)
if not rectangles:
return
infos = formatter(rects, page, dpi)
infos = formatter(rectangles, page, dpi)
return infos
pages = convert_pages_to_images(pdf, index=index, dpi=dpi, skip_pages_without_images=skip_pages_without_images)
@@ -40,22 +35,26 @@ def make_analysis_pipeline(analysis_fn, formatter, dpi, skip_pages_without_image
yield from flatten(filter(truth, results))
return analyse_pipeline
return analysis_pipeline
def table_parsing_formatter(rects, page: ImagePlus, dpi):
def format_rect(rect: Rectangle):
rect_plus = RectanglePlus.from_pixels(*rect.xyxy(), page.info, alpha=False, dpi=dpi)
return rect_plus.asdict(derotate=True)
def table_parsing_formatter(rectangles, page: ImagePlus, dpi):
def format_rectangle(rectangle: Rectangle):
rectangle_plus = RectanglePlus.from_pixels(*rectangle_to_xyxy(rectangle), page.info, alpha=False, dpi=dpi)
return rectangle_plus.asdict(derotate=True)
bboxes = lmap(format_rect, rects)
bboxes = lmap(format_rectangle, rectangles)
return {"pageInfo": page.asdict(natural_index=True), "tableCells": bboxes}
def figure_detection_formatter(rects, page, dpi):
def format_rect(rect: Rectangle):
rect_plus = RectanglePlus.from_pixels(*rect.xyxy(), page.info, alpha=False, dpi=dpi)
def figure_detection_formatter(rectangles, page, dpi):
def format_rectangle(rectangle: Rectangle):
rect_plus = RectanglePlus.from_pixels(*rectangle_to_xyxy(rectangle), page.info, alpha=False, dpi=dpi)
return asdict(ImageInfo(page.info, rect_plus.asbbox(derotate=False), rect_plus.alpha))
return lmap(format_rect, rects)
return lmap(format_rectangle, rectangles)
def rectangle_to_xyxy(rectangle: Rectangle):
return rectangle.x1, rectangle.y1, rectangle.x2, rectangle.y2

View File

@@ -1,15 +1,11 @@
from functools import partial
from itertools import chain, starmap
from operator import attrgetter
import cv2
import numpy as np
from funcy import lmap, lfilter
from cv_analysis.layout_parsing import parse_layout
from cv_analysis.utils.postprocessing import remove_isolated # xywh_to_vecs, xywh_to_vec_rect, adjacent1d
from cv_analysis.utils.structures import Rectangle
from cv_analysis.utils.visual_logging import vizlogger
from cv_analysis.utils.conversion import box_to_rectangle
from cv_analysis.utils.postprocessing import remove_isolated
from cv_analysis.utils.visual_logger import vizlogger
def add_external_contours(image, image_h_w_lines_only):
@@ -31,8 +27,7 @@ def apply_motion_blur(image: np.array, angle, size=80):
size (int): kernel size; 80 found empirically to work well
Returns:
np.array
np.ndarray
"""
k = np.zeros((size, size), dtype=np.float32)
vizlogger.debug(k, "tables08_blur_kernel1.png")
@@ -55,10 +50,9 @@ def isolate_vertical_and_horizontal_components(img_bin):
Args:
img_bin (np.array): array corresponding to single binarized page image
bounding_rects (list): list of layout boxes of the form (x, y, w, h), potentially containing tables
Returns:
np.array
np.ndarray
"""
line_min_width = 48
kernel_h = np.ones((1, line_min_width), np.uint8)
@@ -90,10 +84,9 @@ def find_table_layout_boxes(image: np.array):
def is_large_enough(box):
(x, y, w, h) = box
if w * h >= 100000:
return Rectangle.from_xywh(box)
return box_to_rectangle(box)
layout_boxes = parse_layout(image)
a = lmap(is_large_enough, layout_boxes)
return lmap(is_large_enough, layout_boxes)
@@ -103,7 +96,7 @@ def preprocess(image: np.array):
return ~image
def turn_connected_components_into_rects(image: np.array):
def turn_connected_components_into_rectangles(image: np.array):
def is_large_enough(stat):
x1, y1, w, h, area = stat
return area > 2000 and w > 35 and h > 25
@@ -117,7 +110,7 @@ def turn_connected_components_into_rects(image: np.array):
return []
def parse_tables(image: np.array, show=False):
def parse_tables(image: np.array):
"""Runs the full table parsing process.
Args:
@@ -129,11 +122,8 @@ def parse_tables(image: np.array, show=False):
image = preprocess(image)
image = isolate_vertical_and_horizontal_components(image)
rects = turn_connected_components_into_rects(image)
#print(rects, "\n\n")
rects = list(map(Rectangle.from_xywh, rects))
#print(rects, "\n\n")
rects = remove_isolated(rects)
#print(rects, "\n\n")
return rects
boxes = turn_connected_components_into_rectangles(image)
rectangles = lmap(box_to_rectangle, boxes)
rectangles = remove_isolated(rectangles)
return rectangles

View File

@@ -0,0 +1,51 @@
from functools import reduce
from typing import Iterable
import cv2
import numpy as np
from funcy import first
from cv_analysis.utils.rectangle import Rectangle
def find_contours_and_hierarchies(image):
contours, hierarchies = cv2.findContours(image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
return contours, first(hierarchies) if hierarchies is not None else None
def dilate_page_components(image: np.ndarray) -> np.ndarray:
# FIXME: Parameterize via factory
image = cv2.GaussianBlur(image, (7, 7), 0)
# FIXME: Parameterize via factory
thresh = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# FIXME: Parameterize via factory
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
# FIXME: Parameterize via factory
dilate = cv2.dilate(thresh, kernel, iterations=4)
return dilate
def normalize_to_gray_scale(image: np.ndarray) -> np.ndarray:
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) if len(image.shape) > 2 else image
return image
def threshold_image(image: np.ndarray) -> np.ndarray:
# FIXME: Parameterize via factory
_, image = cv2.threshold(image, 254, 255, cv2.THRESH_BINARY)
return image
def invert_image(image: np.ndarray):
return ~image
def fill_rectangles(image: np.ndarray, rectangles: Iterable[Rectangle]) -> np.ndarray:
image = reduce(fill_in_component_area, rectangles, image)
return image
def fill_in_component_area(image: np.ndarray, rectangle: Rectangle) -> np.ndarray:
cv2.rectangle(image, (rectangle.x1, rectangle.y1), (rectangle.x2, rectangle.y2), (0, 0, 0), -1)
cv2.rectangle(image, (rectangle.x1, rectangle.y1), (rectangle.x2, rectangle.y2), (255, 255, 255), 7)
return image

View File

@@ -1,120 +0,0 @@
from itertools import combinations, starmap, product
from typing import Iterable
def is_near_enough(rect_pair, max_gap=14):
x1, y1, w1, h1 = rect_pair[0]
x2, y2, w2, h2 = rect_pair[1]
return any([abs(x1 - (x2 + w2)) <= max_gap,
abs(x2 - (x1 + w1)) <= max_gap,
abs(y2 - (y1 + h1)) <= max_gap,
abs(y1 - (y2 + h2)) <= max_gap])
def is_overlapping(rect_pair):
x1, y1, w1, h1 = rect_pair[0]
x2, y2, w2, h2 = rect_pair[1]
dx = min(x1 + w1, x2 + w2) - max(x1, x2)
dy = min(y1 + h1, y2 + h2) - max(y1, y2)
return True if (dx >= 0) and (dy >= 0) else False
def is_on_same_line(rect_pair):
x1, y1, w1, h1 = rect_pair[0]
x2, y2, w2, h2 = rect_pair[1]
return any([any([abs(y1 - y2) <= 10,
abs(y1 + h1 - (y2 + h2)) <= 10]),
any([y2 <= y1 and y1 + h1 <= y2 + h2,
y1 <= y2 and y2 + h2 <= y1 + h1])])
def has_correct_position1(rect_pair):
x1, y1, w1, h1 = rect_pair[0]
x2, y2, w2, h2 = rect_pair[1]
return any([any([abs(x1 - x2) <= 10,
abs(y1 - y2) <= 10,
abs(x1 + w1 - (x2 + w2)) <= 10,
abs(y1 + h1 - (y2 + h2)) <= 10]),
any([y2 <= y1 and y1 + h1 <= y2 + h2,
y1 <= y2 and y2 + h2 <= y1 + h1,
x2 <= x1 and x1 + w1 <= x2 + w2,
x1 <= x2 and x2 + w2 <= x1 + w1])])
def is_related(rect_pair):
return (is_near_enough(rect_pair) and has_correct_position1(rect_pair)) or is_overlapping(
rect_pair)
def fuse_rects(rect1, rect2):
if rect1 == rect2:
return rect1
x1, y1, w1, h1 = rect1
x2, y2, w2, h2 = rect2
topleft = list(min(product([x1, x2], [y1, y2])))
bottomright = list(max(product([x1 + w1, x2 + w2], [y1 + h1, y2 + h2])))
w = [bottomright[0] - topleft[0]]
h = [bottomright[1] - topleft[1]]
return tuple(topleft + w + h)
def rects_not_the_same(r):
return r[0] != r[1]
def find_related_rects(rects):
rect_pairs = list(filter(is_related, combinations(rects, 2)))
rect_pairs = list(filter(rects_not_the_same, rect_pairs))
if not rect_pairs:
return [], rects
rel_rects = list(set([rect for pair in rect_pairs for rect in pair]))
unrel_rects = [rect for rect in rects if rect not in rel_rects]
return rect_pairs, unrel_rects
def connect_related_rects(rects):
rects_to_connect, rects_new = find_related_rects(rects)
while len(rects_to_connect) > 0:
rects_fused = list(starmap(fuse_rects, rects_to_connect))
rects_fused = list(dict.fromkeys(rects_fused))
if len(rects_fused) == 1:
rects_new += rects_fused
rects_fused = []
rects_to_connect, connected_rects = find_related_rects(rects_fused)
rects_new += connected_rects
if len(rects_to_connect) > 1 and len(set(rects_to_connect)) == 1:
rects_new.append(rects_fused[0])
rects_to_connect = []
return rects_new
def connect_related_rects2(rects: Iterable[tuple]):
rects = list(rects)
current_idx = 0
while True:
if current_idx + 1 >= len(rects) or len(rects) <= 1:
break
merge_happened = False
current_rect = rects.pop(current_idx)
for idx, maybe_related_rect in enumerate(rects):
if is_related((current_rect, maybe_related_rect)):
current_rect = fuse_rects(current_rect, maybe_related_rect)
rects.pop(idx)
merge_happened = True
break
rects.insert(0, current_rect)
if not merge_happened:
current_idx += 1
elif merge_happened:
current_idx = 0
return rects

View File

@@ -0,0 +1,47 @@
import json
from typing import Sequence, Union
import cv2
import numpy as np
from PIL import Image
from cv_analysis.utils.rectangle import Rectangle
Image_t = Union[Image.Image, np.ndarray]
def contour_to_rectangle(contour):
return box_to_rectangle(cv2.boundingRect(contour))
def box_to_rectangle(box: Sequence[int]) -> Rectangle:
x, y, w, h = box
return Rectangle(x, y, x + w, y + h)
def rectangle_to_box(rectangle: Rectangle) -> Sequence[int]:
return [rectangle.x1, rectangle.y1, rectangle.width, rectangle.height]
class RectangleJSONEncoder(json.JSONEncoder):
def __init__(self, *args, **kwargs):
json.JSONEncoder.__init__(self, *args, **kwargs)
self._replacement_map = {}
def default(self, o):
if isinstance(o, Rectangle):
return {"x1": o.x1, "x2": o.x2, "y1": o.y1, "y2": o.y2}
else:
return json.JSONEncoder.default(self, o)
def encode(self, o):
result = json.JSONEncoder.encode(self, o)
return result
def normalize_image_format_to_array(image: Image_t):
return np.array(image).astype(np.uint8) if isinstance(image, Image.Image) else image
def normalize_image_format_to_pil(image: Image_t):
return Image.fromarray(image.astype(np.uint8)) if isinstance(image, np.ndarray) else image

View File

@@ -1,33 +1,51 @@
import cv2
import numpy as np
from PIL import Image
from PIL.Image import Image as Image_t
from matplotlib import pyplot as plt
from cv_analysis.utils.conversion import normalize_image_format_to_array
def show_image_cv2(image, maxdim=700):
def show_image(image, backend="mpl", **kwargs):
image = normalize_image_format_to_array(image)
if backend == "mpl":
show_image_mpl(image, **kwargs)
elif backend == "cv2":
show_image_cv2(image, **kwargs)
elif backend == "pil":
Image.fromarray(image).show()
else:
raise ValueError(f"Unknown backend: {backend}")
def show_image_cv2(image, maxdim=700, **kwargs):
h, w, c = image.shape
maxhw = max(h, w)
if maxhw > maxdim:
ratio = maxdim / maxhw
h = int(h * ratio)
w = int(w * ratio)
img = cv2.resize(image, (h, w))
img = cv2.resize(image, (h, w))
cv2.imshow("", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
def show_image_mpl(image):
def show_image_mpl(image, **kwargs):
if isinstance(image, Image_t):
# noinspection PyTypeChecker
image = np.array(image)
# noinspection PyArgumentList
assert image.max() <= 255
fig, ax = plt.subplots(1, 1)
fig.set_size_inches(20, 20)
assert image.dtype == np.uint8
ax.imshow(image, cmap="gray")
ax.title.set_text(kwargs.get("title", ""))
plt.show()
def show_image(image, backend="m"):
if backend.startswith("m"):
show_image_mpl(image)
else:
show_image_cv2(image)
def save_image(image, path):
cv2.imwrite(path, image)

View File

@@ -1,19 +1,23 @@
from typing import Union
import cv2
import numpy as np
from PIL import Image
from cv_analysis.utils import copy_and_normalize_channels
def draw_contours(image, contours, color=None, annotate=False):
def draw_contours(image, contours):
image = copy_and_normalize_channels(image)
for cont in contours:
cv2.drawContours(image, cont, -1, (0, 255, 0), 4)
for contour in contours:
cv2.drawContours(image, contour, -1, (0, 255, 0), 4)
return image
def draw_rectangles(image, rectangles, color=None, annotate=False):
def draw_rectangles(image: Union[np.ndarray, Image.Image], rectangles, color=None, annotate=False, filled=False):
def annotate_rect(x, y, w, h):
cv2.putText(
image,
@@ -21,18 +25,18 @@ def draw_rectangles(image, rectangles, color=None, annotate=False):
(x + (w // 2) - 12, y + (h // 2) + 9),
cv2.FONT_HERSHEY_SIMPLEX,
1,
(0, 255, 0),
(0, 255, 0, 255),
2,
)
image = copy_and_normalize_channels(image)
if not color:
color = (0, 255, 0)
color = (0, 255, 0, 255)
for rect in rectangles:
x, y, w, h = rect
cv2.rectangle(image, (x, y), (x + w, y + h), color, 2)
cv2.rectangle(image, (x, y), (x + w, y + h), color, -1 if filled else 1)
if annotate:
annotate_rect(x, y, w, h)

View File

@@ -5,7 +5,7 @@ def is_large_enough(cont, min_area):
return cv2.contourArea(cont, False) > min_area
def is_not_too_large(cnt, max_area):
def is_small_enough(cnt, max_area):
return cv2.contourArea(cnt, False) < max_area

View File

@@ -0,0 +1,29 @@
from numpy import array, ndarray
import pdf2image
from PIL import Image
from cv_analysis.utils.preprocessing import preprocess_page_array
def open_analysis_input_file(path_or_bytes, first_page=1, last_page=None):
assert first_page > 0, "Page numbers are 1-based."
assert last_page is None or last_page >= first_page, "last_page must be greater than or equal to first_page."
last_page = last_page or first_page
if type(path_or_bytes) == str:
if path_or_bytes.lower().endswith((".png", ".jpg", ".jpeg")):
pages = [Image.open(path_or_bytes)]
elif path_or_bytes.lower().endswith(".pdf"):
pages = pdf2image.convert_from_path(path_or_bytes, first_page=first_page, last_page=last_page)
else:
raise IOError("Invalid file extension. Accepted filetypes: .png, .jpg, .jpeg, .pdf")
elif type(path_or_bytes) == bytes:
pages = pdf2image.convert_from_bytes(path_or_bytes, first_page=first_page, last_page=last_page)
elif type(path_or_bytes) in {list, ndarray}:
return path_or_bytes
pages = [preprocess_page_array(array(p)) for p in pages]
return pages

View File

@@ -0,0 +1,54 @@
from functools import reduce
from itertools import combinations
from typing import List, Tuple, Set
from funcy import all
from cv_analysis.utils import until, make_merger_sentinel
from cv_analysis.utils.rectangle import Rectangle
from cv_analysis.utils.spacial import related
def merge_related_rectangles(rectangles: List[Rectangle]) -> List[Rectangle]:
"""Merges rectangles that are related to each other, iterating on partial merge results until no more mergers are
possible."""
assert isinstance(rectangles, list)
no_new_merges = make_merger_sentinel()
return until(no_new_merges, merge_rectangles_once, rectangles)
def merge_rectangles_once(rectangles: List[Rectangle]) -> List[Rectangle]:
"""Merges rectangles that are related to each other, but does not iterate on the results."""
rectangles = set(rectangles)
merged, used = reduce(merge_if_related, combinations(rectangles, 2), (set(), set()))
return list(merged | rectangles - used)
T = Tuple[Set[Rectangle], Set[Rectangle]]
V = Tuple[Rectangle, Rectangle]
def merge_if_related(merged_and_used_so_far: T, rectangle_pair: V) -> T:
"""Merges two rectangles if they are related, otherwise returns the accumulator unchanged."""
alpha, beta = rectangle_pair
merged, used = merged_and_used_so_far
def unused(*args) -> bool:
return not used & {*args}
if all(unused, (alpha, beta)) and related(alpha, beta):
return merged | {bounding_rect(alpha, beta)}, used | {alpha, beta}
else:
return merged, used
def bounding_rect(alpha: Rectangle, beta: Rectangle) -> Rectangle:
"""Returns the smallest rectangle that contains both rectangles."""
return Rectangle(
min(alpha.x1, beta.x1),
min(alpha.y1, beta.y1),
max(alpha.x2, beta.x2),
max(alpha.y2, beta.y2),
)

View File

@@ -0,0 +1,56 @@
from functools import reduce
from operator import itemgetter
from typing import Iterable
import numpy as np
from funcy import lmap, lpluck, first
from cv_analysis.utils import lift
from cv_analysis.utils.rectangle import Rectangle
def compute_document_score(result_dict, ground_truth_dicts):
extract_cells = lambda dicts: lpluck("cells", dicts["pages"])
cells_per_ground_truth_page, cells_per_result_page = map(extract_cells, (ground_truth_dicts, result_dict))
cells_on_page_to_rectangles = lift(rectangle_from_dict)
cells_on_pages_to_rectangles = lift(cells_on_page_to_rectangles)
rectangles_per_ground_truth_page, rectangles_per_result_page = map(
cells_on_pages_to_rectangles, (cells_per_ground_truth_page, cells_per_result_page)
)
scores = lmap(compute_page_iou, rectangles_per_result_page, rectangles_per_ground_truth_page)
n_cells_per_page = np.array(lmap(len, cells_per_ground_truth_page))
document_score = np.average(scores, weights=n_cells_per_page / n_cells_per_page.sum())
return document_score
def rectangle_from_dict(d):
x1, y1, w, h = itemgetter("x", "y", "width", "height")(d)
return Rectangle(x1, y1, x1 + w, y1 + h)
def compute_page_iou(predicted_rectangles: Iterable[Rectangle], true_rectangles: Iterable[Rectangle]):
def find_best_iou(sum_so_far_and_candidate_rectangles, true_rectangle):
sum_so_far, predicted_rectangles = sum_so_far_and_candidate_rectangles
best_match, best_iou = find_max_overlap(true_rectangle, predicted_rectangles)
return sum_so_far + best_iou, predicted_rectangles - {best_match}
predicted_rectangles = set(predicted_rectangles)
true_rectangles = set(true_rectangles)
iou_sum = first(reduce(find_best_iou, true_rectangles, (0, predicted_rectangles)))
normalizing_factor = 1 / max(len(predicted_rectangles), len(true_rectangles))
score = normalizing_factor * iou_sum
return score
def find_max_overlap(rectangle: Rectangle, candidate_rectangles: Iterable[Rectangle]):
best_candidate_rectangle = max(candidate_rectangles, key=rectangle.iou)
iou = rectangle.iou(best_candidate_rectangle)
return best_candidate_rectangle, iou

View File

@@ -1,27 +0,0 @@
from numpy import array, ndarray
import pdf2image
from PIL import Image
from cv_analysis.utils.preprocessing import preprocess_page_array
def open_pdf(pdf, first_page=0, last_page=None):
first_page += 1
last_page = None if last_page is None else last_page + 1
if type(pdf) == str:
if pdf.lower().endswith((".png", ".jpg", ".jpeg")):
pages = [Image.open(pdf)]
elif pdf.lower().endswith(".pdf"):
pages = pdf2image.convert_from_path(pdf, first_page=first_page, last_page=last_page)
else:
raise IOError("Invalid file extension. Accepted filetypes:\n\t.png\n\t.jpg\n\t.jpeg\n\t.pdf")
elif type(pdf) == bytes:
pages = pdf2image.convert_from_bytes(pdf, first_page=first_page, last_page=last_page)
elif type(pdf) in {list, ndarray}:
return pdf
pages = [preprocess_page_array(array(p)) for p in pages]
return pages

View File

@@ -1,15 +1,15 @@
from collections import namedtuple
from functools import partial
from itertools import starmap, compress
from typing import Iterable, List
from cv_analysis.utils.structures import Rectangle
from typing import Iterable, List, Sequence
from cv_analysis.utils.rectangle import Rectangle
def remove_overlapping(rectangles: Iterable[Rectangle]) -> List[Rectangle]:
def overlap(a: Rectangle, rect2: Rectangle) -> float:
return a.intersection(rect2) > 0
def does_not_overlap(rect: Rectangle, rectangles: Iterable[Rectangle]) -> list:
def does_not_overlap(rect: Rectangle, rectangles: Iterable[Rectangle]) -> bool:
return not any(overlap(rect, rect2) for rect2 in rectangles if not rect == rect2)
rectangles = list(filter(partial(does_not_overlap, rectangles=rectangles), rectangles))
@@ -17,15 +17,18 @@ def remove_overlapping(rectangles: Iterable[Rectangle]) -> List[Rectangle]:
def remove_included(rectangles: Iterable[Rectangle]) -> List[Rectangle]:
keep = [rect for rect in rectangles if not rect.is_included(rectangles)]
return keep
rectangles_to_keep = [rect for rect in rectangles if not rect.is_included(rectangles)]
return rectangles_to_keep
def __remove_isolated_unsorted(rectangles: Iterable[Rectangle]) -> List[Rectangle]:
def is_connected(rect: Rectangle, rectangles: Iterable[Rectangle]):
return any(rect.adjacent(rect2) for rect2 in rectangles if not rect == rect2)
rectangles = list(filter(partial(is_connected, rectangles=list(rectangles)), rectangles))
if not isinstance(rectangles, list):
rectangles = list(rectangles)
rectangles = list(filter(partial(is_connected, rectangles=rectangles), rectangles))
return rectangles
@@ -42,9 +45,9 @@ def __remove_isolated_sorted(rectangles: Iterable[Rectangle]) -> List[Rectangle]
return rectangles
def remove_isolated(rectangles: Iterable[Rectangle], input_unsorted=True) -> List[Rectangle]:
def remove_isolated(rectangles: Iterable[Rectangle], input_unsorted: bool = True) -> List[Rectangle]:
return (__remove_isolated_unsorted if input_unsorted else __remove_isolated_sorted)(rectangles)
def has_no_parent(hierarchy):
def has_no_parent(hierarchy: Sequence[int]) -> bool:
return hierarchy[-1] <= 0

View File

@@ -0,0 +1,85 @@
# See https://stackoverflow.com/a/33533514
from __future__ import annotations
from typing import Iterable, Union
from funcy import identity
from cv_analysis.utils.spacial import adjacent, contains, intersection, iou, area, is_contained
Coord = Union[int, float]
class Rectangle:
def __init__(self, x1, y1, x2, y2, discrete=True):
"""Creates a rectangle from two points."""
nearest_valid = int if discrete else identity
self.__x1 = nearest_valid(x1)
self.__y1 = nearest_valid(y1)
self.__x2 = nearest_valid(x2)
self.__y2 = nearest_valid(y2)
def __repr__(self):
return f"Rectangle({self.x1}, {self.y1}, {self.x2}, {self.y2})"
@property
def x1(self):
return self.__x1
@property
def x2(self):
return self.__x2
@property
def y1(self):
return self.__y1
@property
def y2(self):
return self.__y2
@property
def width(self):
return abs(self.x2 - self.x1)
@property
def height(self):
return abs(self.y2 - self.y1)
@property
def coords(self):
return [self.x1, self.y1, self.x2, self.y2]
def __hash__(self):
return hash((self.x1, self.y1, self.x2, self.y2))
def __iter__(self):
yield self.x1
yield self.y1
yield self.width
yield self.height
def area(self):
"""Calculates the area of this rectangle."""
return area(self)
def intersection(self, other):
"""Calculates the intersection of this and the given other rectangle."""
return intersection(self, other)
def iou(self, other: Rectangle):
"""Calculates the intersection over union of this and the given other rectangle."""
return iou(self, other)
def includes(self, other: Rectangle, tol=3):
"""Checks if this rectangle contains the given other."""
return contains(self, other, tol)
def is_included(self, rectangles: Iterable[Rectangle]):
"""Checks if this rectangle is contained by any of the given rectangles."""
return is_contained(self, rectangles)
def adjacent(self, other: Rectangle, tolerance=7):
"""Checks if this rectangle is adjacent to the given other."""
return adjacent(self, other, tolerance)

View File

@@ -0,0 +1,286 @@
# See https://stackoverflow.com/a/39757388
from __future__ import annotations
from functools import lru_cache
from operator import attrgetter
from typing import TYPE_CHECKING, Iterable
from funcy import juxt, rpartial, compose, lflatten, first, second
from cv_analysis.utils import lift
if TYPE_CHECKING:
from cv_analysis.utils.rectangle import Rectangle
def adjacent(alpha: Rectangle, beta: Rectangle, tolerance=7, strict=False):
"""Checks if the two rectangles are adjacent to each other.
Args:
alpha: The first rectangle.
beta: The second rectangle.
tolerance: The maximum distance between the two rectangles.
strict: If True, the rectangles must be adjacent along one axis and contained within the other axis. Else, the
rectangles must be adjacent along one axis and overlapping the other axis.
Returns:
True if the two rectangles are adjacent to each other, False otherwise.
"""
select_strictness_variant = first if strict else second
test_candidates = [
# +---+
# | | +---+
# | a | | b |
# | | +___+
# +___+
(right_left_aligned_and_vertically_contained, right_left_aligned_and_vertically_overlapping),
# +---+
# +---+ | |
# | b | | a |
# +___+ | |
# +___+
(left_right_aligned_and_vertically_contained, left_right_aligned_and_vertically_overlapping),
# +-----------+
# | a |
# +___________+
# +-----+
# | b |
# +_____+
(bottom_top_aligned_and_horizontally_contained, bottom_top_aligned_and_horizontally_overlapping),
# +-----+
# | b |
# +_____+
# +-----------+
# | a |
# +___________+
(top_bottom_aligned_and_horizontally_contained, top_bottom_aligned_and_horizontally_overlapping),
]
tests = map(select_strictness_variant, test_candidates)
return any(juxt(*tests)(alpha, beta, tolerance))
def right_left_aligned_and_vertically_overlapping(alpha: Rectangle, beta: Rectangle, tol):
"""Checks if the first rectangle is left of the other within a tolerance and also overlaps the other's y range."""
return adjacent_along_one_axis_and_overlapping_along_perpendicular_axis(
alpha.x2, beta.x1, beta.y1, beta.y2, alpha.y1, alpha.y2, tolerance=tol
)
def left_right_aligned_and_vertically_overlapping(alpha: Rectangle, beta: Rectangle, tol):
"""Checks if the first rectangle is right of the other within a tolerance and also overlaps the other's y range."""
return adjacent_along_one_axis_and_overlapping_along_perpendicular_axis(
alpha.x1, beta.x2, beta.y1, beta.y2, alpha.y1, alpha.y2, tolerance=tol
)
def bottom_top_aligned_and_horizontally_overlapping(alpha: Rectangle, beta: Rectangle, tol):
"""Checks if the first rectangle is above the other within a tolerance and also overlaps the other's x range."""
return adjacent_along_one_axis_and_overlapping_along_perpendicular_axis(
alpha.y2, beta.y1, beta.x1, beta.x2, alpha.x1, alpha.x2, tolerance=tol
)
def top_bottom_aligned_and_horizontally_overlapping(alpha: Rectangle, beta: Rectangle, tol):
"""Checks if the first rectangle is below the other within a tolerance and also overlaps the other's x range."""
return adjacent_along_one_axis_and_overlapping_along_perpendicular_axis(
alpha.y1, beta.y2, beta.x1, beta.x2, alpha.x1, alpha.x2, tolerance=tol
)
def right_left_aligned_and_vertically_contained(alpha: Rectangle, beta: Rectangle, tol):
"""Checks if the first rectangle is left of the other within a tolerance and also contains the other's y range."""
return adjacent_along_one_axis_and_contained_within_perpendicular_axis(
alpha.x2, beta.x1, beta.y1, beta.y2, alpha.y1, alpha.y2, tolerance=tol
)
def left_right_aligned_and_vertically_contained(alpha: Rectangle, beta: Rectangle, tol):
"""Checks if the first rectangle is right of the other within a tolerance and also contains the other's y range."""
return adjacent_along_one_axis_and_contained_within_perpendicular_axis(
alpha.x1, beta.x2, beta.y1, beta.y2, alpha.y1, alpha.y2, tolerance=tol
)
def bottom_top_aligned_and_horizontally_contained(alpha: Rectangle, beta: Rectangle, tol):
"""Checks if the first rectangle is above the other within a tolerance and also contains the other's x range."""
return adjacent_along_one_axis_and_contained_within_perpendicular_axis(
alpha.y2, beta.y1, beta.x1, beta.x2, alpha.x1, alpha.x2, tolerance=tol
)
def top_bottom_aligned_and_horizontally_contained(alpha: Rectangle, beta: Rectangle, tol):
"""Checks if the first rectangle is below the other within a tolerance and also contains the other's x range."""
return adjacent_along_one_axis_and_contained_within_perpendicular_axis(
alpha.y1, beta.y2, beta.x1, beta.x2, alpha.x1, alpha.x2, tolerance=tol
)
def adjacent_along_one_axis_and_overlapping_along_perpendicular_axis(
axis_0_point_1,
axis_1_point_2,
axis_1_contained_point_1,
axis_1_contained_point_2,
axis_1_lower_bound,
axis_1_upper_bound,
tolerance,
):
"""Checks if two points are adjacent along one axis and two other points overlap a range along the perpendicular
axis.
"""
return adjacent_along_one_axis_and_overlapping_or_contained_along_perpendicular_axis(
axis_0_point_1,
axis_1_point_2,
axis_1_contained_point_1,
axis_1_contained_point_2,
axis_1_lower_bound,
axis_1_upper_bound,
tolerance,
mode="overlapping",
)
def adjacent_along_one_axis_and_contained_within_perpendicular_axis(
axis_0_point_1,
axis_1_point_2,
axis_1_contained_point_1,
axis_1_contained_point_2,
axis_1_lower_bound,
axis_1_upper_bound,
tolerance,
):
"""Checks if two points are adjacent along one axis and two other points overlap a range along the perpendicular
axis.
"""
return adjacent_along_one_axis_and_overlapping_or_contained_along_perpendicular_axis(
axis_0_point_1,
axis_1_point_2,
axis_1_contained_point_1,
axis_1_contained_point_2,
axis_1_lower_bound,
axis_1_upper_bound,
tolerance,
mode="contained",
)
def adjacent_along_one_axis_and_overlapping_or_contained_along_perpendicular_axis(
axis_0_point_1,
axis_1_point_2,
axis_1_contained_point_1,
axis_1_contained_point_2,
axis_1_lower_bound,
axis_1_upper_bound,
tolerance,
mode,
):
"""Checks if two points are adjacent along one axis and two other points overlap a range along the perpendicular
axis or are contained in that range, depending on the mode specified.
"""
assert mode in ["overlapping", "contained"]
quantifier = any if mode == "overlapping" else all
return all(
[
abs(axis_0_point_1 - axis_1_point_2) <= tolerance,
quantifier(
[
axis_1_lower_bound <= p <= axis_1_upper_bound
for p in [axis_1_contained_point_1, axis_1_contained_point_2]
]
),
]
)
def contains(alpha: Rectangle, beta: Rectangle, tol=3):
"""Checks if the first rectangle contains the second rectangle."""
return (
beta.x1 + tol >= alpha.x1
and beta.y1 + tol >= alpha.y1
and beta.x2 - tol <= alpha.x2
and beta.y2 - tol <= alpha.y2
)
def is_contained(rectangle: Rectangle, rectangles: Iterable[Rectangle]):
"""Checks if the rectangle is contained within any of the other rectangles."""
other_rectangles = filter(lambda r: r != rectangle, rectangles)
return any(map(rpartial(contains, rectangle), other_rectangles))
def iou(alpha: Rectangle, beta: Rectangle):
"""Calculates the intersection area over the union area of two rectangles."""
return intersection(alpha, beta) / union(alpha, beta)
def area(rectangle: Rectangle):
"""Calculates the area of a rectangle."""
return abs((rectangle.x2 - rectangle.x1) * (rectangle.y2 - rectangle.y1))
def union(alpha: Rectangle, beta: Rectangle):
"""Calculates the union area of two rectangles."""
return area(alpha) + area(beta) - intersection(alpha, beta)
@lru_cache(maxsize=1000)
def intersection(alpha, beta):
"""Calculates the intersection of two rectangles."""
return intersection_along_x_axis(alpha, beta) * intersection_along_y_axis(alpha, beta)
def intersection_along_x_axis(alpha, beta):
"""Calculates the intersection along the x-axis."""
return intersection_along_axis(alpha, beta, "x")
def intersection_along_y_axis(alpha, beta):
"""Calculates the intersection along the y-axis."""
return intersection_along_axis(alpha, beta, "y")
def intersection_along_axis(alpha, beta, axis):
"""Calculates the intersection along the given axis.
Cases:
a b
[-----] (---) ==> [a1, b1, a2, b2] ==> max(0, (a2 - b1)) = 0
b a
(---) [-----] ==> [b1, a1, b2, a2] ==> max(0, (b2 - a1)) = 0
a b
[--(----]----) ==> [a1, b1, a2, b2] ==> max(0, (a2 - b1)) = (a2 - b1)
a b
(-[---]----) ==> [b1, a1, a2, b2] ==> max(0, (a2 - a1)) = (a2 - a1)
b a
[-(---)----] ==> [a1, b1, b2, a2] ==> max(0, (b2 - b1)) = (b2 - b1)
b a
(----[--)----] ==> [b1, a1, b2, a2] ==> max(0, (b2 - a1)) = (b2 - a1)
"""
assert axis in ["x", "y"]
def get_component_accessor(component):
"""Returns a function that accesses the given component of a rectangle."""
return attrgetter(f"{axis}{component}")
def make_access_components_and_sort_fn(component):
"""Returns a function that accesses and sorts the given component of multiple rectangles."""
assert component in [1, 2]
return compose(sorted, lift(get_component_accessor(component)))
sort_first_components, sort_second_components = map(make_access_components_and_sort_fn, [1, 2])
min_c1, max_c1, min_c2, max_c2 = lflatten(juxt(sort_first_components, sort_second_components)((alpha, beta)))
intersection = max(0, min_c2 - max_c1)
return intersection
def related(alpha: Rectangle, beta: Rectangle):
return close(alpha, beta) or overlap(alpha, beta)
def close(alpha: Rectangle, beta: Rectangle, max_gap=14):
# FIXME: Parameterize via factory
return adjacent(alpha, beta, tolerance=max_gap, strict=True)
def overlap(alpha: Rectangle, beta: Rectangle):
return intersection(alpha, beta) > 0

View File

@@ -1,131 +0,0 @@
from json import dumps
from typing import Iterable
import numpy as np
from funcy import identity
class Rectangle:
def __init__(self, x1=None, y1=None, w=None, h=None, x2=None, y2=None, indent=4, format="xywh", discrete=True):
make_discrete = int if discrete else identity
try:
self.x1 = make_discrete(x1)
self.y1 = make_discrete(y1)
self.w = make_discrete(w) if w else make_discrete(x2 - x1)
self.h = make_discrete(h) if h else make_discrete(y2 - y1)
self.x2 = make_discrete(x2) if x2 else self.x1 + self.w
self.y2 = make_discrete(y2) if y2 else self.y1 + self.h
assert np.isclose(self.x1 + self.w, self.x2)
assert np.isclose(self.y1 + self.h, self.y2)
self.indent = indent
self.format = format
except Exception as err:
raise Exception("x1, y1, (w|x2), and (h|y2) must be defined.") from err
def json_xywh(self):
return {"x": self.x1, "y": self.y1, "width": self.w, "height": self.h}
def json_xyxy(self):
return {"x1": self.x1, "y1": self.y1, "x2": self.x2, "y2": self.y2}
def json_full(self):
# TODO: can we make all coords x0, y0 based? :)
return {
"x0": self.x1,
"y0": self.y1,
"x1": self.x2,
"y1": self.y2,
"width": self.w,
"height": self.h,
}
def json(self):
json_func = {"xywh": self.json_xywh, "xyxy": self.json_xyxy}.get(self.format, self.json_full)
return json_func()
def xyxy(self):
return self.x1, self.y1, self.x2, self.y2
def xywh(self):
return self.x1, self.y1, self.w, self.h
def intersection(self, rect):
bx1, by1, bx2, by2 = rect.xyxy()
if (self.x1 > bx2) or (bx1 > self.x2) or (self.y1 > by2) or (by1 > self.y2):
return 0
intersection_ = (min(self.x2, bx2) - max(self.x1, bx1)) * (min(self.y2, by2) - max(self.y1, by1))
return intersection_
def area(self):
return (self.x2 - self.x1) * (self.y2 - self.y1)
def iou(self, rect):
intersection = self.intersection(rect)
if intersection == 0:
return 0
union = self.area() + rect.area() - intersection
return intersection / union
def includes(self, other: "Rectangle", tol=3):
"""does a include b?"""
return (
other.x1 + tol >= self.x1
and other.y1 + tol >= self.y1
and other.x2 - tol <= self.x2
and other.y2 - tol <= self.y2
)
def is_included(self, rectangles: Iterable["Rectangle"]):
return any(rect.includes(self) for rect in rectangles if not rect == self)
def adjacent(self, rect2: "Rectangle", tolerance=7):
# tolerance=1 was set too low; most lines are 2px wide
def adjacent2d(sixtuple):
g, h, i, j, k, l = sixtuple
return (abs(g - h) <= tolerance) and any(k <= p <= l for p in [i, j])
if rect2 is None:
return False
return any(
map(
adjacent2d,
[
(self.x2, rect2.x1, rect2.y1, rect2.y2, self.y1, self.y2),
(self.x1, rect2.x2, rect2.y1, rect2.y2, self.y1, self.y2),
(self.y2, rect2.y1, rect2.x1, rect2.x2, self.x1, self.x2),
(self.y1, rect2.y2, rect2.x1, rect2.x2, self.x1, self.x2),
],
)
)
@classmethod
def from_xyxy(cls, xyxy_tuple, discrete=True):
x1, y1, x2, y2 = xyxy_tuple
return cls(x1=x1, y1=y1, x2=x2, y2=y2, discrete=discrete)
@classmethod
def from_xywh(cls, xywh_tuple, discrete=True):
x, y, w, h = xywh_tuple
return cls(x1=x, y1=y, w=w, h=h, discrete=discrete)
@classmethod
def from_dict_xywh(cls, xywh_dict, discrete=True):
return cls(x1=xywh_dict["x"], y1=xywh_dict["y"], w=xywh_dict["width"], h=xywh_dict["height"], discrete=discrete)
def __str__(self):
return dumps(self.json(), indent=self.indent)
def __repr__(self):
return str(self.json())
def __iter__(self):
return list(self.json().values()).__iter__()
def __eq__(self, rect):
return all([self.x1 == rect.x1, self.y1 == rect.y1, self.w == rect.w, self.h == rect.h])
class Contour:
def __init__(self):
pass

View File

@@ -1,61 +0,0 @@
from typing import Iterable
import numpy as np
from cv_analysis.utils.structures import Rectangle
def find_max_overlap(box: Rectangle, box_list: Iterable[Rectangle]):
best_candidate = max(box_list, key=lambda x: box.iou(x))
iou = box.iou(best_candidate)
return best_candidate, iou
def compute_page_iou(results_boxes: Iterable[Rectangle], ground_truth_boxes: Iterable[Rectangle]):
results = list(results_boxes)
truth = list(ground_truth_boxes)
if (not results) or (not truth):
return 0
iou_sum = 0
denominator = max(len(results), len(truth))
while results and truth:
gt_box = truth.pop()
best_match, best_iou = find_max_overlap(gt_box, results)
results.remove(best_match)
iou_sum += best_iou
score = iou_sum / denominator
return score
def compute_document_score(results_dict, annotation_dict):
page_weights = np.array([len(page["cells"]) for page in annotation_dict["pages"]])
page_weights = page_weights / sum(page_weights)
scores = []
for i in range(len(annotation_dict["pages"])):
scores.append(
compute_page_iou(
map(Rectangle.from_dict_xywh, results_dict["pages"][i]["cells"]),
map(Rectangle.from_dict_xywh, annotation_dict["pages"][i]["cells"]),
)
)
doc_score = np.average(np.array(scores), weights=page_weights)
return doc_score
"""
from cv_analysis.utils.test_metrics import *
r1 = Rectangle.from_dict_xywh({'x': 30, 'y': 40, 'width': 50, 'height': 60})
r2 = Rectangle.from_dict_xywh({'x': 40, 'y': 30, 'width': 55, 'height': 65})
r3 = Rectangle.from_dict_xywh({'x': 45, 'y': 35, 'width': 45, 'height': 55})
r4 = Rectangle.from_dict_xywh({'x': 25, 'y': 45, 'width': 45, 'height': 55})
d1 = {"pages": [{"cells": [r1.json_xywh(), r2.json_xywh()]}]}
d2 = {"pages": [{"cells": [r3.json_xywh(), r4.json_xywh()]}]}
compute_iou_from_boxes(r1, r2)
find_max_overlap(r1, [r2, r3, r4])
compute_page_iou([r1, r2], [r3, r4])
compute_document_score(d1, d2)
"""

View File

@@ -1,9 +1,17 @@
from numpy import generic
from __future__ import annotations
import cv2
import numpy as np
from PIL import Image
from funcy import first, iterate, keep
from numpy import generic
def copy_and_normalize_channels(image):
if isinstance(image, Image.Image):
image = np.array(image)
image = image.copy()
try:
image = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)
@@ -17,3 +25,55 @@ def npconvert(ob):
if isinstance(ob, generic):
return ob.item()
raise TypeError
def lift(fn):
def lifted(coll):
yield from map(fn, coll)
return lifted
def star(fn):
def starred(args):
return fn(*args)
return starred
def lstarkeep(fn, coll):
return list(starkeep(fn, coll))
def starkeep(fn, coll):
yield from keep(star(fn), coll)
def until(cond, func, *args, **kwargs):
return first(filter(cond, iterate(func, *args, **kwargs)))
def conj(x, xs):
return [x, *xs]
def rconj(xs, x):
return [*xs, x]
def make_merger_sentinel():
def no_new_mergers(records):
nonlocal number_of_records_so_far
number_of_records_now = len(records)
if number_of_records_now == number_of_records_so_far:
return True
else:
number_of_records_so_far = number_of_records_now
return False
number_of_records_so_far = -1
return no_new_mergers
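Taken together, lift, star, until and make_merger_sentinel form a small "repeat a pass until it stops producing new mergers" toolkit. A hedged sketch of how they compose, assuming these helpers are exposed from cv_analysis.utils (as the later test import of lift suggests); the merge pass itself is a made-up stand-in:

from cv_analysis.utils import lift, make_merger_sentinel, star, until  # assumed module path

# lift maps a per-item function over a collection lazily; star unpacks argument tuples.
doubled = list(lift(lambda n: 2 * n)([1, 2, 3]))  # [2, 4, 6]
summed = star(lambda a, b: a + b)((3, 4))         # 7

def merge_pass(records):
    # Hypothetical merge step: collapse the first two records until only
    # three remain; stands in for a real box-merging pass.
    return [records[0] + records[1], *records[2:]] if len(records) > 3 else records

stop = make_merger_sentinel()
# iterate() keeps re-applying merge_pass; the sentinel returns True for the
# first pass whose record count did not change, and until() returns that state.
print(until(stop, merge_pass, [1, 2, 3, 4, 5, 6]))  # [10, 5, 6]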

1169
poetry.lock generated

File diff suppressed because it is too large

View File

@@ -36,6 +36,19 @@ loguru = "^0.6.0"
pytest = "^7.0.1"
[tool.poetry.group.test.dependencies]
albumentations = "^1.3.0"
faker = "^16.4.0"
pandas = "^1.5.2"
pytablewriter = "^0.64.2"
dataframe-image = "^0.1.5"
blend-modes = "^2.1.0"
[tool.poetry.group.dev.dependencies]
ipython = "^8.9.0"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

View File

@@ -1,50 +1,75 @@
"""
Usage:
python scripts/annotate.py /home/iriley/Documents/pdf/scanned/10.pdf 5 --type table --show
python scripts/annotate.py /home/iriley/Documents/pdf/scanned/10.pdf 5 --type redaction --show
python scripts/annotate.py /home/iriley/Documents/pdf/scanned/10.pdf 5 --type layout --show
python scripts/annotate.py /home/iriley/Documents/pdf/scanned/10.pdf 5 --type figure --show
"""
import argparse
import loguru
from cv_analysis.figure_detection.figure_detection import detect_figures
from cv_analysis.layout_parsing import parse_layout
from cv_analysis.redaction_detection import find_redactions
from cv_analysis.table_parsing import parse_tables
from cv_analysis.utils.display import show_image
from cv_analysis.utils.draw import draw_contours, draw_rectangles
from cv_analysis.utils.open_pdf import open_pdf
from cv_analysis.utils.visual_logging import vizlogger
from cv_analysis.utils.drawing import draw_contours, draw_rectangles
from cv_analysis.utils.input import open_analysis_input_file
def parse_args():
parser = argparse.ArgumentParser()
parser = argparse.ArgumentParser(
description="Annotate PDF pages with detected elements. Specified pages form a closed interval and are 1-based."
)
parser.add_argument("pdf_path")
parser.add_argument("--page_index", type=int, default=0)
parser.add_argument("--type", choices=["table", "redaction", "layout", "figure"], default="table")
parser.add_argument("--show", action="store_true", default=False)
parser.add_argument(
"--first_page",
"-f",
type=int,
default=1,
)
parser.add_argument(
"-last_page",
"-l",
help="if not specified, defaults to the value of the first page specified",
type=int,
default=None,
)
parser.add_argument(
"--type",
"-t",
help="element type to look for and analyze",
choices=["table", "redaction", "layout", "figure"],
default="table",
)
parser.add_argument("--page", "-p", type=int, default=1)
args = parser.parse_args()
return args
def annotate_page(page_image, analysis_function, drawing_function, name="tmp.png", show=True):
result = analysis_function(page_image)
page_image = drawing_function(page_image, result)
vizlogger.debug(page_image, name)
def annotate_page(page_image, analysis_fn, draw_fn):
result = analysis_fn(page_image)
page_image = draw_fn(page_image, result)
show_image(page_image)
if __name__ == "__main__":
args = parse_args()
page = open_pdf(args.pdf_path, first_page=args.page_index, last_page=args.page_index)[0]
name = f"{args.type}_final_result.png"
draw = draw_rectangles
if args.type == "table":
from cv_analysis.table_parsing import parse_tables as analyze
elif args.type == "redaction":
from cv_analysis.redaction_detection import find_redactions as analyze
def get_analysis_and_draw_fn_for_type(element_type):
analysis_fn, draw_fn = {
"table": (parse_tables, draw_rectangles),
"redaction": (find_redactions, draw_contours),
"layout": (parse_layout, draw_rectangles),
"figure": (detect_figures, draw_rectangles),
}[element_type]
draw = draw_contours
elif args.type == "layout":
from cv_analysis.layout_parsing import parse_layout as analyze
elif args.type == "figure":
from cv_analysis.figure_detection.figure_detection import detect_figures
analyze = detect_figures
annotate_page(page, analyze, draw, name=name, show=args.show)
return analysis_fn, draw_fn
def main(args):
loguru.logger.info(f"Annotating {args.type}s in {args.pdf_path}...")
pages = open_analysis_input_file(args.pdf_path, first_page=args.first_page, last_page=args.last_page)
for page in pages:
analysis_fn, draw_fn = get_analysis_and_draw_fn_for_type(args.type)
annotate_page(page, analysis_fn, draw_fn)
if __name__ == "__main__":
try:
main(parse_args())
except KeyboardInterrupt:
pass
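With the reworked argument set, an invocation would presumably look like the examples below (hedged: flag spellings follow the parser defined above; the paths are placeholders):

python scripts/annotate.py /path/to/scan.pdf --first_page 3 --last_page 5 --type table
python scripts/annotate.py /path/to/scan.pdf -f 7 -t figure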

View File

@@ -10,7 +10,7 @@ from funcy import lmap
from cv_analysis.figure_detection.figure_detection import detect_figures
from cv_analysis.layout_parsing import parse_layout
from cv_analysis.table_parsing import parse_tables
from cv_analysis.utils.draw import draw_rectangles
from cv_analysis.utils.drawing import draw_rectangles
from pdf2img.conversion import convert_pages_to_images

View File

@@ -2,28 +2,27 @@ import argparse
import json
from pathlib import Path
from cv_analysis.server.pipeline import get_analysis_pipeline
from loguru import logger
from cv_analysis.server.pipeline import make_analysis_pipeline_for_element_type
def parse_args():
parser = argparse.ArgumentParser()
parser.add_argument("pdf")
parser.add_argument("--type", "-t", choices=["table", "layout", "figure"], required=True)
parser.add_argument("pdf", type=Path)
parser.add_argument("--element_type", "-t", choices=["table", "figure"], required=True)
return parser.parse_args()
def main(args):
analysis_fn = make_analysis_pipeline_for_element_type(args.element_type)
logger.info(f"Analysing document for {args.element_type}s...")
results = list(analysis_fn(args.pdf.read_bytes()))
print(json.dumps(results, indent=2))
if __name__ == "__main__":
args = parse_args()
analysis_fn = get_analysis_pipeline(args.type)
with open(args.pdf, "rb") as f:
pdf_bytes = f.read()
results = list(analysis_fn(pdf_bytes))
folder = Path(args.pdf).parent
file_stem = Path(args.pdf).stem
with open(f"{folder}/{file_stem}_{args.type}.json", "w+") as f:
json.dump(results, f, indent=2)
main(parse_args())

View File

@@ -4,7 +4,7 @@ import logging
from operator import itemgetter
from cv_analysis.config import get_config
from cv_analysis.server.pipeline import get_analysis_pipeline
from cv_analysis.server.pipeline import make_analysis_pipeline_for_segment_type
from cv_analysis.utils.banner import make_art
from pyinfra import config as pyinfra_config
from pyinfra.queue.queue_manager import QueueManager
@@ -31,7 +31,10 @@ def analysis_callback(queue_message: dict):
should_publish_result = True
object_bytes = gzip.decompress(storage.get_object(bucket, object_name))
analysis_fn = get_analysis_pipeline(operation, CV_CONFIG.table_parsing_skip_pages_without_images)
analysis_fn = make_analysis_pipeline_for_segment_type(
operation,
skip_pages_without_images=CV_CONFIG.table_parsing_skip_pages_without_images,
)
results = analysis_fn(object_bytes)
response = {**queue_message, "data": list(results)}

View File

@@ -1,6 +1,11 @@
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
pytest_plugins = [
"test.fixtures.table_parsing",
"test.fixtures.figure_detection",
"test.fixtures.page_generation.page",
]

8
test/data/paper/.gitignore vendored Normal file
View File

@@ -0,0 +1,8 @@
/crumpled_paper.jpg
/digital_paper.jpg
/gray_paper.jpg
/rough_grain_paper.jpg
/crumpled.jpg
/digital.jpg
/plain.jpg
/rough_grain.jpg

View File

@@ -0,0 +1,4 @@
outs:
- md5: d38ebef85a0689bfd047edc98e4a5f93
size: 14131338
path: crumpled.jpg

View File

@@ -0,0 +1,4 @@
outs:
- md5: 8c4c96efe26731e14dd4a307dad718fd
size: 108546
path: digital.jpg

View File

@@ -0,0 +1,4 @@
outs:
- md5: 33741812aaff0e54849c5128ae2dccf4
size: 6924421
path: plain.jpg

View File

@@ -0,0 +1,4 @@
outs:
- md5: eb62925241917d55db05e07851f3f6b9
size: 1679152
path: rough_grain.jpg

View File

1514
test/fixtures/page_generation/page.py vendored Normal file

File diff suppressed because it is too large

View File

@@ -6,7 +6,7 @@ import cv2
import pytest
from funcy import first
from cv_analysis.utils.structures import Rectangle
from cv_analysis.utils.rectangle import Rectangle
@pytest.fixture

View File

@@ -9,8 +9,8 @@ from loguru import logger
from cv_analysis.config import get_config
from cv_analysis.locations import REPO_ROOT_PATH, TEST_DATA_DVC
from cv_analysis.utils.draw import draw_rectangles
from cv_analysis.utils.open_pdf import open_pdf
from cv_analysis.utils.drawing import draw_rectangles
from cv_analysis.utils.input import open_analysis_input_file
from test.fixtures.figure_detection import paste_text
CV_CONFIG = get_config()
@@ -19,7 +19,7 @@ CV_CONFIG = get_config()
@pytest.fixture
def client_page_with_table(test_file_index, dvc_test_data):
img_path = join(CV_CONFIG.test_data_dir, f"test{test_file_index}.png")
return first(open_pdf(img_path))
return first(open_analysis_input_file(img_path))
@pytest.fixture(scope="session")

View File

@@ -0,0 +1,6 @@
from cv_analysis.utils.display import show_image
def test_blank_page(page_with_content):
pass
# show_image(page_with_content)

View File

@@ -3,6 +3,7 @@ from math import prod
import cv2
import pytest
from cv_analysis.utils.spacial import area
from test.utils.utils import powerset
@@ -15,21 +16,20 @@ class TestFindPrimaryTextRegions:
@pytest.mark.parametrize("image_size", [(200, 200), (500, 500), (800, 800)])
def test_page_without_text_yields_figures(self, figure_detection_pipeline, page_with_images, image_size):
results = figure_detection_pipeline(page_with_images)
result_figures_size = map(lambda x: (x.w, x.h), results)
result_rectangles = figure_detection_pipeline(page_with_images)
result_figure_sizes = map(lambda r: (r.width, r.height), result_rectangles)
assert all([image_size[0] < res[0] and image_size[1] < res[1] for res in result_figures_size])
assert all([image_size[0] < res[0] and image_size[1] < res[1] for res in result_figure_sizes])
@pytest.mark.parametrize("font_scale", [1, 1.5, 2])
@pytest.mark.parametrize("font_style", [cv2.FONT_HERSHEY_SIMPLEX, cv2.FONT_HERSHEY_COMPLEX])
@pytest.mark.parametrize("text_types", powerset(["body", "header", "caption"]))
@pytest.mark.parametrize("error_tolerance", [0.025])
def test_page_with_only_text_yields_no_figures(self, figure_detection_pipeline, page_with_text, error_tolerance):
results = figure_detection_pipeline(page_with_text)
result_figures_area = sum(map(lambda x: (x.w * x.h), results))
result_rectangles = figure_detection_pipeline(page_with_text)
result_figure_areas = sum(map(area, result_rectangles))
page_area = prod(page_with_text.shape)
error = result_figures_area / page_area
error = result_figure_areas / page_area
assert error <= error_tolerance
@@ -45,11 +45,11 @@ class TestFindPrimaryTextRegions:
image_size,
error_tolerance,
):
results = list(figure_detection_pipeline(page_with_images_and_text))
result_rectangles = list(figure_detection_pipeline(page_with_images_and_text))
result_figures_area = sum(map(lambda x: (x.w * x.h), results))
result_figure_areas = sum(map(area, result_rectangles))
expected_figure_area = prod(image_size)
error = abs(result_figures_area - expected_figure_area) / expected_figure_area
error = abs(result_figure_areas - expected_figure_area) / expected_figure_area
assert error <= error_tolerance

View File


@@ -3,12 +3,11 @@ import numpy as np
import pytest
from cv_analysis.server.pipeline import table_parsing_formatter, figure_detection_formatter, make_analysis_pipeline
from cv_analysis.utils.structures import Rectangle
from cv_analysis.utils.rectangle import Rectangle
def analysis_fn_mock(image: np.ndarray):
bbox = (0, 0, 42, 42)
return [Rectangle.from_xyxy(bbox)]
return [Rectangle(0, 0, 42, 42)]
@pytest.fixture

View File

@@ -2,9 +2,12 @@ from itertools import starmap
import cv2
import pytest
from funcy import lmap, compose, zipdict
from cv_analysis.table_parsing import parse_tables
from cv_analysis.utils.test_metrics import compute_document_score
from cv_analysis.utils import lift
from cv_analysis.utils.rectangle import Rectangle
from cv_analysis.utils.metrics import compute_document_score
@pytest.mark.parametrize("score_threshold", [0.95])
@@ -12,8 +15,9 @@ from cv_analysis.utils.test_metrics import compute_document_score
def test_table_parsing_on_client_pages(
score_threshold, client_page_with_table, expected_table_annotation, test_file_index
):
result = [x.json_xywh() for x in parse_tables(client_page_with_table)]
formatted_result = {"pages": [{"page": str(test_file_index), "cells": result}]}
results = compose(lift(rectangle_to_dict), parse_tables)(client_page_with_table)
formatted_result = {"pages": [{"cells": results}]}
score = compute_document_score(formatted_result, expected_table_annotation)
@@ -25,6 +29,14 @@ def error_tolerance(line_thickness):
return line_thickness * 7
def rectangle_to_dict(rectangle: Rectangle):
return zipdict(["x", "y", "width", "height"], rectangle_to_xywh(rectangle))
def rectangle_to_xywh(rectangle: Rectangle):
return rectangle.x1, rectangle.y1, abs(rectangle.x1 - rectangle.x2), abs(rectangle.y1 - rectangle.y2)
@pytest.mark.parametrize("line_thickness", [1, 2, 3])
@pytest.mark.parametrize("line_type", [cv2.LINE_4, cv2.LINE_AA, cv2.LINE_8])
@pytest.mark.parametrize("table_style", ["closed horizontal vertical", "open horizontal vertical"])
@@ -32,7 +44,7 @@ def error_tolerance(line_thickness):
@pytest.mark.parametrize("background_color", [255, 220])
@pytest.mark.parametrize("table_shape", [(5, 8)])
def test_table_parsing_on_generic_pages(page_with_table, expected_gold_page_with_table, error_tolerance):
result = [x.xywh() for x in parse_tables(page_with_table)]
result = lmap(rectangle_to_xywh, parse_tables(page_with_table))
assert (
result == expected_gold_page_with_table
or average_error(result, expected_gold_page_with_table) <= error_tolerance
@@ -46,8 +58,8 @@ def test_table_parsing_on_generic_pages(page_with_table, expected_gold_page_with
@pytest.mark.parametrize("background_color", [255, 220])
@pytest.mark.parametrize("table_shape", [(5, 8)])
@pytest.mark.xfail
def test_bad_qual_table(page_with_patchy_table, expected_gold_page_with_table, error_tolerance):
result = [x.xywh() for x in parse_tables(page_with_patchy_table)]
def test_low_quality_table(page_with_patchy_table, expected_gold_page_with_table, error_tolerance):
result = lmap(rectangle_to_xywh, parse_tables(page_with_patchy_table))
assert (
result == expected_gold_page_with_table
or average_error(result, expected_gold_page_with_table) <= error_tolerance