Sunday, December 26, 2021

A small programming exercise in APL #2: Combinations

In this post I'm going to continue my APL exploration by working in a possible solution for the problem of generating all the combinations of 'n' elements of an array.

Generating the 'n' combinations of an array means generating a sequence of all possible 'n' unique elements from the original array . For example, given the array 12, 34, 35, 65 all possible '2' combinations of this array are:

  • 12, 34
  • 12, 35
  • 12, 65
  • 34, 35
  • 34, 65
  • 35, 65

Notice that order is not important. That is "12, 34" is considered to be the same as "34, 12". Generating all 'n' permutations of the elements of an array may be an interesting problem for a future post.

One of my goals was to be as idiomatic as possible (with my limited APL knowledge!). Because of this, I will try avoid using explicit loops or conditionals and instead use array operators.

Strategy

The general strategy for solving this problem was to calculate all possible boolean arrays having 'n' bits and use Compress to extract the elements.

For example, in the array 12, 34, 35, 65 all possible boolean vectors having two bits on are:

  • 1 1 0 0
  • 1 0 1 0
  • 1 0 0 1
  • 0 1 1 0
  • 0 1 0 1
  • 0 0 1 1
Using these vectors in APL we get the elements we need:

      a
12 34 35 65
      1 1 0 0 / a
12 34
      1 0 1 0 / a
12 35
      1 0 0 1 / a
12 65
      0 1 1 0 / a
34 35
      0 1 0 1 / a
34 65
      0 0 1 1 / a
35 65

This strategy is not efficient in space or time but it allowed me to explore a solution without using imperative constructs or recursion.

Generating the boolean arrays

To generate the boolean arrays we start by generating all integer values required to encode n-bit values:


      a ← 12 34 35 65
      ⍳ ( ¯1 + 2 * ⍴ a)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

When converting these numbers to binary we get all possible boolean vectors for 4-bit values. This is same as the same as the length of our sample array. We can see these arrays if we use Encode and Each:


     { ((⍴ a) ⍴ 2) ⊤ ⍵ } ¨ ⍳ ( ¯1 + 2 * ⍴ a)
 0 0 0 1  0 0 1 0  0 0 1 1  0 1 0 0  0 1 0 1  0 1 1 0  0 1 1 1  1 0 0 0\
  1 0 0 1  1 0 1 0  1 0 1 1  1 1 0 0  1 1 0 1  1 1 1 0  1 1 1 1

I can reshape this array to get a better representation:


      15 1 ⍴ { ((⍴ a) ⍴ 2) ⊤ ⍵ } ¨ ⍳ ( ¯1 + 2 * ⍴ a)
 0 0 0 1
 0 0 1 0
 0 0 1 1
 0 1 0 0
 0 1 0 1
 0 1 1 0
 0 1 1 1
 1 0 0 0
 1 0 0 1
 1 0 1 0
 1 0 1 1
 1 1 0 0
 1 1 0 1
 1 1 1 0
 1 1 1 1

Now we are only interested in boolean arrays with 'n' number of '1' bits. For n = 2 we can get only those elements using the Compress operator:


      seq ← ⍳ ( ¯1 + 2 * ⍴ a)
      n ← 2
      ({ n =  +/ ((⍴ a) ⍴ 2) ⊤ ⍵} ¨ seq) / seq
3 5 6 9 10 12

Again we can visualize these values using reshape and encode:


      6 1 ⍴ { ((⍴ a) ⍴ 2) ⊤ ⍵ } ¨ ({ n =  +/ ((⍴ a) ⍴ 2) ⊤ ⍵} ¨ seq) / seq
 0 0 1 1
 0 1 0 1
 0 1 1 0
 1 0 0 1
 1 0 1 0
 1 1 0 0
 

Selecting desired elements

Now that we have our boolean arrays we can pick all the values using Compress:


     {(((⍴ a) ⍴ 2) ⊤ ⍵)/a} ¨ ({ n =  +/ ((⍴ a) ⍴ 2) ⊤ ⍵} ¨ seq) / seq
 35 65  34 65  34 35  12 65  12 35  12 34

And again we can reshape this array to see it better:


      6 1 ⍴ {(((⍴ a) ⍴ 2) ⊤ ⍵)/a} ¨ ({ n =  +/ ((⍴ a) ⍴ 2) ⊤ ⍵} ¨ seq) / seq
 35 65
 34 65
 34 35
 12 65
 12 35
 12 34

Finally we can pack these expressions in a function:


∇r←a ncombinations1 n;seq
  seq ← ⍳ ( ¯1 + 2 * ⍴ a)
  r ← {(((⍴ a) ⍴ 2) ⊤ ⍵)/a} ¨ (({ n =  +/ ((⍴ a) ⍴ 2) ⊤ ⍵} ¨ seq) / seq\
)
∇

We can use it like this:


      1 2 3 4 5 ncombinations1 3
 3 4 5  2 4 5  2 3 5  2 3 4  1 4 5  1 3 5  1 3 4  1 2 5  1 2 4  1 2 3

APL has a build-in Binomal operator which allows us to calculate the number of combinations. For example:


      a ← 1 2 3 4 5
      (3 ! ⍴ a) = ⍴ a ncombinations3 3
1

Final words

It was very interesting (and difficult) to try to write a solution of this problem using only array operations. Reading the definion of the ncombinations1 I noticed that there are many expressions nested in parenthesis (maybe related to my limited APL knowledge). I know that APL developers value terseness, so I tried to reduce the size of the expressions. Here is a new version:


∇r←a ncombinations3 n;seq
  s ← ⍳ ( ¯1 + 2 * ⍴ a)
  b ← 2⍴⍨⍴a
  r ← {a/⍨b⊤⍵}¨s/⍨{n=+/b⊤⍵}¨s
∇

I was able to remove a couple of parenthesis by taking advantage of the Commute operator.

GNU APL was used to create the examples in this post.