----------------------------------------
ColumnA | ColumnB | ColumnC |
----------------------------------------
Cat | Shirt | Pencil |
Dog | Shirt | Eraser |
Worm | Dress | Pen |
Cow | Shirt | Pen |
Cat | Shirt | Pen |
Cat | Jacket | Pen |
Cow | Shirt | Pen |
Cat | Shirt | Pen |
Cat | Jacket | Pen |
Cow | Shirt | Pen |
Cat | Shirt | Pen |
Cat | Jacket | Pen |
With the example data above I am trying to find the most re-occuring combinations which are a pair of 2 or greater.
For example
Shirt,Pen 6
Cat,Pen 6
Cat,Shirt 4
Jacket, Pen 3
Pen,Cow 3
Cat,Shirt,Pen 3
Cat,Jacket,Pen 3
Cow,Shirt,Pen 3
I need this for up to 10 columns of data.
Cat,Shirt
is the same as Shirt,Cat
.
What is the best algorithm to use? Preferably in SQL but I could also try PHP?