python - Algorithm to group sets of points together that follow a direction -
note: placing question in both matlab , python tags proficient in these languages. however, welcome solutions in language.
question preamble
i have taken image fisheye lens. image consists of pattern bunch of square objects. want image detect centroid of each of these squares, use these points perform undistortion of image - specifically, seeking right distortion model parameters. should noted not of squares need detected. long majority of them are, that's totally fine.... isn't point of post. parameter estimation algorithm have written, problem requires points appear collinear in image.
the base question want ask given these points, best way group them each group consists of horizontal line or vertical line?
background problem
this isn't important regards question i'm asking, if you'd know got data , further understand question i'm asking, please read. if you're not interested, can skip right problem setup section below.
an example of image dealing shown below:
it 960 x 960 image. image higher resolution, subsample image facilitate faster processing time. can see, there bunch of square patterns dispersed in image. also, centroids have calculated respect above subsampled image.
the pipeline have set retrieve centroids following:
- perform canny edge detection
- focus on region of interest minimizes false positives. region of interest squares without of black tape covers 1 of sides.
- find distinct closed contours
for each distinct closed contour...
a. perform harris corner detection
b. determine if result has 4 corner points
c. if does, contour belonged square , find centroid of shape
d. if doesn't, skip shape
place detected centroids step #4 matrix further examination.
here's example result above image. each detected square has 4 points colour coded according location of respect square itself. each centroid have detected, write id right centroid in image itself.
with above image, there 37 detected squares.
problem setup
suppose have image pixel points stored in n x 3
matrix. first 2 columns x
(horizontal) , y
(vertical) coordinates in image coordinate space, y
coordinate inverted, means positive y
moves downwards. third column id associated point.
here code written in matlab takes these points, plots them onto 2d grid , labels each point third column of matrix. if read above background, these points detected algorithm outlined above.
data = [ 475. , 605.75, 1.; 571. , 586.5 , 2.; 233. , 558.5 , 3.; 669.5 , 562.75, 4.; 291.25, 546.25, 5.; 759. , 536.25, 6.; 362.5 , 531.5 , 7.; 448. , 513.5 , 8.; 834.5 , 510. , 9.; 897.25, 486. , 10.; 545.5 , 491.25, 11.; 214.5 , 481.25, 12.; 271.25, 463. , 13.; 646.5 , 466.75, 14.; 739. , 442.75, 15.; 340.5 , 441.5 , 16.; 817.75, 421.5 , 17.; 423.75, 417.75, 18.; 202.5 , 406. , 19.; 519.25, 392.25, 20.; 257.5 , 382. , 21.; 619.25, 368.5 , 22.; 148. , 359.75, 23.; 324.5 , 356. , 24.; 713. , 347.75, 25.; 195. , 335. , 26.; 793.5 , 332.5 , 27.; 403.75, 328. , 28.; 249.25, 308. , 29.; 495.5 , 300.75, 30.; 314. , 279. , 31.; 764.25, 249.5 , 32.; 389.5 , 249.5 , 33.; 475. , 221.5 , 34.; 565.75, 199. , 35.; 802.75, 173.75, 36.; 733. , 176.25, 37.]; figure; hold on; axis ij; scatter(data(:,1), data(:,2),40, 'r.'); text(data(:,1)+10, data(:,2)+10, num2str(data(:,3)));
similarly in python, using numpy
, matplotlib
, have:
import numpy np import matplotlib.pyplot plt data = np.array([[ 475. , 605.75, 1. ], [ 571. , 586.5 , 2. ], [ 233. , 558.5 , 3. ], [ 669.5 , 562.75, 4. ], [ 291.25, 546.25, 5. ], [ 759. , 536.25, 6. ], [ 362.5 , 531.5 , 7. ], [ 448. , 513.5 , 8. ], [ 834.5 , 510. , 9. ], [ 897.25, 486. , 10. ], [ 545.5 , 491.25, 11. ], [ 214.5 , 481.25, 12. ], [ 271.25, 463. , 13. ], [ 646.5 , 466.75, 14. ], [ 739. , 442.75, 15. ], [ 340.5 , 441.5 , 16. ], [ 817.75, 421.5 , 17. ], [ 423.75, 417.75, 18. ], [ 202.5 , 406. , 19. ], [ 519.25, 392.25, 20. ], [ 257.5 , 382. , 21. ], [ 619.25, 368.5 , 22. ], [ 148. , 359.75, 23. ], [ 324.5 , 356. , 24. ], [ 713. , 347.75, 25. ], [ 195. , 335. , 26. ], [ 793.5 , 332.5 , 27. ], [ 403.75, 328. , 28. ], [ 249.25, 308. , 29. ], [ 495.5 , 300.75, 30. ], [ 314. , 279. , 31. ], [ 764.25, 249.5 , 32. ], [ 389.5 , 249.5 , 33. ], [ 475. , 221.5 , 34. ], [ 565.75, 199. , 35. ], [ 802.75, 173.75, 36. ], [ 733. , 176.25, 37. ]]) plt.figure() plt.gca().invert_yaxis() plt.plot(data[:,0], data[:,1], 'r.', markersize=14) idx in np.arange(data.shape[0]): plt.text(data[idx,0]+10, data[idx,1]+10, str(int(data[idx,2])), size='large') plt.show()
we get:
back question
as can see, these points more or less in grid pattern , can see can form lines between points. specifically, can see there lines can formed horizontally , vertically.
for example, if reference image in background section of problem, can see there 5 groups of points can grouped in horizontal manner. example, points 23, 26, 29, 31, 33, 34, 35, 37 , 36 form 1 group. points 19, 21, 24, 28, 30 , 32 form group , on , forth. in vertical sense, can see points 26, 19, 12 , 3 form 1 group, points 29, 21, 13 , 5 form group , on.
question ask
my question this: what method can group points in horizontal groupings , vertical groupings separately, given points in orientation?
conditions
there must @ least three points per line. if there less that, not qualify segment. therefore, points 36 , 10 don't qualify vertical line, , isolated point 23 shouldn't quality vertical line, part of first horizontal grouping.
the above calibration pattern can in orientation. however, i'm dealing with, worst kind of orientation can see above in background section.
expected output
the output pair of lists first list has elements each element gives sequence of point ids form horizontal line. similarly, second list has elements each element gives sequence of point ids form vertical line.
therefore, expected output horizontal sequences this:
matlab
horiz_list = {[23, 26, 29, 31, 33, 34, 35, 37, 36], [19, 21, 24, 28, 30, 32], ...}; vert_list = {[26, 19, 12, 3], [29, 21, 13, 5], ....};
python
horiz_list = [[23, 26, 29, 31, 33, 34, 35, 37, 36], [19, 21, 24, 28, 30, 32], ....] vert_list = [[26, 19, 12, 3], [29, 21, 13, 5], ...]
what have tried
algorithmically, have tried undo rotation experienced @ these points. i've performed principal components analysis , tried projecting points respect computed orthogonal basis vectors points more or less on straight rectangular grid.
once have that, it's simple matter of doing scanline processing group points based on differential change on either horizontal or vertical coordinates. you'd sort coordinates either x
or y
values, examine these sorted coordinates , large change. once encounter change, can group points in between changes form lines. doing respect each dimension give either horizontal or vertical groupings.
with regards pca, here's did in matlab , python:
matlab
%# step #1 - data - no ids data_raw = data(:,1:2); %# decentralize mean data_nomean = bsxfun(@minus, data_raw, mean(data_raw,1)); %# step #2 - determine covariance matrix %# decentralizes mean cov_data = cov(data_raw); %# step #3 - determine right singular vectors [~,~,v] = svd(cov_data); %# step #4 - transform data respect basis f = v.'*data_nomean.'; %# visualize both original data points , transformed data figure; plot(f(1,:), f(2,:), 'b.', 'markersize', 14); axis ij; hold on; plot(data(:,1), data(:,2), 'r.', 'markersize', 14);
python
import numpy np import numpy.linalg la # step #1 , step #2 - decentralize mean centroids_raw = data[:,:2] mean_data = np.mean(centroids_raw, axis=0) # transpose covariance calculation data_nomean = (centroids_raw - mean_data).t # step #3 - determine covariance matrix # doesn't matter if on decentralized result # or normal result - cov subtracts mean off anyway cov_data = np.cov(data_nomean) # step #4 - determine right singular vectors via svd # note - v^t, there's no need transpose _,_,v = la.svd(cov_data) # step #5 - transform data respect basis data_transform = np.dot(v, data_nomean).t plt.figure() plt.gca().invert_yaxis() plt.plot(data[:,0], data[:,1], 'b.', markersize=14) plt.plot(data_transform[:,0], data_transform[:,1], 'r.', markersize=14) plt.show()
the above code not reprojects data, plots both original points , projected points in single figure. however, when tried reprojecting data, plot get:
the points in red original image coordinates while points in blue reprojected onto basis vectors try , remove rotation. still doesn't quite job. there still orientation respect points if tried scanline algorithm, points lines below horizontal tracing or side vertical tracing inadvertently grouped , isn't correct.
perhaps i'm overthinking problem, insights have regarding appreciated. if answer indeed superb, inclined award high bounty i've been stuck on problem quite time.
i hope question wasn't long winded. if don't have idea of how solve this, thank time in reading question regardless.
looking forward insights may have. much!
note 1: has number of settings -> other images may need altered result want see % settings - play around these values
note 2: doesn't find of lines want -> starting point....
to call function, invoke in command prompt:
>> [h, v] = testlines;
we get:
>> celldisp(h) h{1} = 1 2 4 6 9 10 h{2} = 3 5 7 8 11 14 15 17 h{3} = 1 2 4 6 9 10 h{4} = 3 5 7 8 11 14 15 17 h{5} = 1 2 4 6 9 10 h{6} = 3 5 7 8 11 14 15 17 h{7} = 3 5 7 8 11 14 15 17 h{8} = 1 2 4 6 9 10 h{9} = 1 2 4 6 9 10 h{10} = 12 13 16 18 20 22 25 27 h{11} = 13 16 18 20 22 25 27 h{12} = 3 5 7 8 11 14 15 17 h{13} = 3 5 7 8 11 14 15 h{14} = 12 13 16 18 20 22 25 27 h{15} = 3 5 7 8 11 14 15 17 h{16} = 12 13 16 18 20 22 25 27 h{17} = 19 21 24 28 30 h{18} = 21 24 28 30 h{19} = 12 13 16 18 20 22 25 27 h{20} = 19 21 24 28 30 h{21} = 12 13 16 18 20 22 24 25 h{22} = 12 13 16 18 20 22 24 25 27 h{23} = 23 26 29 31 33 34 35 h{24} = 23 26 29 31 33 34 35 37 h{25} = 23 26 29 31 33 34 35 36 37 h{26} = 33 34 35 37 36 h{27} = 31 33 34 35 37 >> celldisp(v) v{1} = 33 28 18 8 1 v{2} = 34 30 20 11 2 v{3} = 26 19 12 3 v{4} = 35 22 14 4 v{5} = 29 21 13 5 v{6} = 25 15 6 v{7} = 31 24 16 7 v{8} = 37 32 27 17 9
a figure generated draws lines through each proper set of points:
function [horiz_list, vert_list] = testlines global counter; global colours; close all; data = [ 475. , 605.75, 1.; 571. , 586.5 , 2.; 233. , 558.5 , 3.; 669.5 , 562.75, 4.; 291.25, 546.25, 5.; 759. , 536.25, 6.; 362.5 , 531.5 , 7.; 448. , 513.5 , 8.; 834.5 , 510. , 9.; 897.25, 486. , 10.; 545.5 , 491.25, 11.; 214.5 , 481.25, 12.; 271.25, 463. , 13.; 646.5 , 466.75, 14.; 739. , 442.75, 15.; 340.5 , 441.5 , 16.; 817.75, 421.5 , 17.; 423.75, 417.75, 18.; 202.5 , 406. , 19.; 519.25, 392.25, 20.; 257.5 , 382. , 21.; 619.25, 368.5 , 22.; 148. , 359.75, 23.; 324.5 , 356. , 24.; 713. , 347.75, 25.; 195. , 335. , 26.; 793.5 , 332.5 , 27.; 403.75, 328. , 28.; 249.25, 308. , 29.; 495.5 , 300.75, 30.; 314. , 279. , 31.; 764.25, 249.5 , 32.; 389.5 , 249.5 , 33.; 475. , 221.5 , 34.; 565.75, 199. , 35.; 802.75, 173.75, 36.; 733. , 176.25, 37.]; figure; hold on; axis ij; % change due benoit_11 scatter(data(:,1), data(:,2),40, 'r.'); text(data(:,1)+10, data(:,2)+10, num2str(data(:,3))); text(data(:,1)+10, data(:,2)+10, num2str(data(:,3))); % process data above run function below(note has sub functions) counter = 0; colours = 'bgrcmy'; [horiz_list, vert_list] = findclosestpoints ( data(:,1), data(:,2) ); function [horiz_list, vert_list] = findclosestpoints ( x, y ) % calc length of points nx = length(x); % set place holder flags modelledh = false(nx,1); modelledv = false(nx,1); horiz_list = {}; vert_list = {}; % loop points p=1:nx % have modelled horizontal line through these? % second last param - true - horizontal, false - vertical if modelledh(p)==false [modelledh, index] = modelpoints ( p, x, y, modelledh, true, true ); horiz_list = [horiz_list index]; else [~, index] = modelpoints ( p, x, y, modelledh, true, false ); horiz_list = [horiz_list index]; end % make temp copy of x , y , remove of points modelled % horizontal -> avoid them being found in % second call. tempx = x; tempy = y; tempx(index) = nan; tempy(index) = nan; tempx(p) = x(p); tempy(p) = y(p); % have found vertial line? if modelledv(p)==false [modelledv, index] = modelpoints ( p, tempx, tempy, modelledv, false, true ); vert_list = [vert_list index]; end end end function [modelled, index] = modelpoints ( p, x, y, modelled, method, fullrun ) % p - row in original data matrix % x - data(:,1) % y - data(:,2) % modelled - array of flags whether rows have been modelled % method - horizontal or vertical (used calc graadients) % fullrun - full calc or indexes % made better storing indexes of each horizontal in method above % settings - play around these values graddelta = 0.2; % find points gradient less value gradlimit = 0.45; % if mean gradient of line above ignore numberofpointstocheck = 7; % number of points check when along line % find other points (this reduces chance of % finding other points far away % optimised example 7 % try varying , see how effect result. % find index of points inline. [index, grad] = calcindex ( x, y, p, graddelta, method ); % check gradient of line if abs(mean(grad))>gradlimit index = []; return end % add point of interest index index = [p index]; % loop through points found above find other points in % line these points (this allows slight curvature combineindex = []; ii=2:length(index) % find inedex of points found above (find points on curve) [index2] = calcindex ( x, y, index(ii), graddelta, method, numberofpointstocheck, grad(ii-1) ); % check point on line on original (i.e. inline -> not @ large angle if any(ismember(index,index2)) % store points found combineindex = unique([index2 combineindex]); end end % copy index index = combineindex; if fullrun % plotting % todo: here need calculate arrays output. xx = x(index); [sx,sorder] = sort(xx); % check found @ least 3 points if length ( index(sorder) ) > 2 % flag modelled on points found modelled(index(sorder)) = true; % plot data plot ( x(index(sorder)), y(index(sorder)), colours(mod(counter,numel(colours)) + 1)); counter = counter + 1; end index = index(sorder); end end function [index, gradcheck] = calcindex ( x, y, p, gradlimit, method, npoints2consider, refgrad ) % x - data(:,1) % y - data(:,2) % p - point of interest % method (x/y) or (y\x) % npoints2consider - @ n points (options) % refgrad - rather looking gradient of closest point -> use % - reference gradient find similar points (finds points on curve) nx = length(x); % calculate gradient g=1:nx if method grad(g) = (x(g)-x(p))\(y(g)-y(p)); else grad(g) = (y(g)-y(p))\(x(g)-x(p)); end end % find distance other points delta = sqrt ( (x-x(p)).^2 + (y-y(p)).^2 ); % set self = nan delta(delta==min(delta)) = nan; % find closest points [m,order] = sort(delta); if nargin == 7 % finding along curve % set far away points nan grad(order(npoints2consider+1:end)) = nan; % find closest points reference gradient within allowable limit index = find(abs(grad-refgrad)<gradlimit==1); % store output gradcheck = grad(index); else % find points closes gradient of closest point index = find(abs(grad-grad(order(1)))<gradlimit==1); % store gradients output gradcheck = grad(index); end end end