Friday 30 June 2017

One Month In

Now one month of GSoC has passed and so far everything has gone much better than I expected! According to my timeline this week would have been the first of two where I work on vectorization. Instead I have already mostly finished the vectorization and have started to work on other things. In this blog post I'll give a summary of what work I have completed and what I have left to do. I'll structure it according to how the functions are listed in the $INDEX$ file [1]. The number after each heading is the number of functions in that category.

Since this will mainly be a list of which files have been modified and which are left to do, it might not be very interesting if you are not familiar with the structure of the interval package.

Interval constant (3)

All of these have been modified to support N-dimensional arrays.

Interval constructor (5)

All of these have been modified to support N-dimensional arrays.

Interval function (most with tightest accuracy) (63)

Almost all of these functions worked out of the box! At least after the API functions to the MPFR and crlibm libraries were fixed; those are covered further down in the list.

The only function that did not work immediately was $linspace$. Even though this function could be generalized to N-dimensional arrays, the standard Octave function only works for matrices (I think the Matlab version only allows scalars). This means that adding support for N-dimensional arrays in the interval version is not a priority. I might do it later on but it is not necessary.
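
For reference, the standard Octave $linspace$ already accepts vector endpoints and returns one row per pair of endpoints, something like this (a hypothetical session, the output formatting may differ slightly):

> linspace ([0, 10], [4, 30], 3)
ans =

    0    2    4
   10   20   30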

Interval matrix operation (16)

Most of the matrix functions do not make sense for N-dimensional arrays. For example matrix multiplication and matrix inversion only make sense for matrices. However, all of the reduction functions are also here; they include $dot$, $prod$, $sum$, $sumabs$ and $sumsq$.

At the moment I have implemented support for N-dimensional arrays in $sum$, $sumabs$ and $prod$. The functions $dot$ and $sumsq$ are not ready; I'm waiting to see what happens with bug #51333 [2] before I continue with that work. Depending on the outcome of the bug I might also have to modify the behaviour of $sum$, $sumabs$ and $prod$ slightly.
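
As a small illustration (a sketch of a session, the exact output format may differ), reducing a 2×2×2 interval array along its third dimension now gives a 2×2 interval matrix:

> A = infsupdec (ones (2, 2, 2));
> sum (A, 3)
ans = 2×2 interval matrix
   [2]_com   [2]_com
   [2]_com   [2]_com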

Interval comparison (19)

All of these have been modified to support N-dimensional arrays.

Set operation (7)

All of these functions have been modified to support N-dimensional arrays except one, $mince$. The function $mince$ is an interval version of $linspace$ and the reasoning here is the same as for $linspace$ above.

Interval reverse operation (12)

Like the interval functions above, all of the functions worked out of the box!

Interval numeric function (11)

These functions also worked out of the box, apart from some small modifications to the documentation for some of them.

Interval input and output (9)

Here there are some functions which require comments; the ones I do not comment on have all gotten support for N-dimensional arrays.

$interval\_bitpack$
I don't think this function makes sense to generalize to N dimensions. It could perhaps take an N-dimensional array as input, but it will always return a row vector. I have left it as it is, for now at least.

$disp$ and $display$
These are functions that might be subject to change later on. At the moment they print N-dimensional arrays of intervals in the same way Octave does for normal arrays. It is however not clear how to handle the $\subset$ symbol and we might decide to change it.

Interval solver or optimizer (5)

The functions $gauss$ and $polyval$ are not generalizable to N-dimensional arrays. I don't think that $fzero$ can be generalized either; for it to work the function must be real-valued.

The function $fsolve$ can perhaps be modified to support N-dimensional arrays. It uses the SIVIA algorithm [3] and I have to dive deeper into how it works to see if it can be done.

For $fminsearch$ nothing needed to be done, it worked for N-dimensional arrays directly.

Interval contractor arithmetic (2)

Both of these functions are used together with $fsolve$, so they also depend on whether SIVIA can be generalized or not.

Verified solver or optimizer (6)

All of these functions work on matrices and cannot be generalized.

Utility function (29)

All of these for which it made sense have been modified to support N-dimensional arrays. Some of them only work for matrices; these are $ctranspose$, $diag$, $transpose$, $tril$ and $triu$. I have left them as they were, though I fixed a bug in $diag$.
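
For example, reshaping an interval array into three dimensions with the updated utility functions should now work along these lines (a sketch of a session):

> x = reshape (infsupdec (1:8), 2, 2, 2);
> size (x)
ans =

   2   2   2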

API function to the MPFR and crlibm libraries (8)

These are the functions that in general required the most work. The ones I have added full support for N-dimensional arrays to are $crlibm\_function$, $mpfr\_function\_d$ and $mpfr\_vector\_sum\_d$. Some of them cannot be generalized; these are $mpfr\_matrix\_mul\_d$, $mpfr\_matrix\_sqr\_d$ and $mpfr\_to\_string\_d$. The functions $mpfr\_linspace\_d$ and $mpfr\_vector\_dot\_d$ are related to what I mentioned above for $linspace$ and $dot$.

Summary

So, summing up, the functions that still require some work are:
  • Functions related to $fsolve$
  • The functions $dot$ and $sumsq$
  • The functions $linspace$ and $mince$
The functions related to $fsolve$ in particular might take some time to handle. My goal is to dive deeper into this next week.

Apart from this there are also some more things that need to be considered. The documentation for the package will need to be updated. This includes adding some examples which make use of the new functionality.

The interval package also did not follow the Octave coding style. All the functions which I have made changes to have been updated to the correct coding style, but many of the functions that worked out of the box still use the old style. It might be that we want to unify the coding style for all files before the next release.

[1] The $INDEX$ file https://sourceforge.net/u/urathai/octave/ci/default/tree/INDEX
[2] Bug #51333 https://savannah.gnu.org/bugs/index.php?51333
[3] The SIVIA algorithm https://www.youtube.com/watch?v=kxYh2cXHdNQ

Thursday 22 June 2017

Vectorization and broadcasting

At the moment I'm actually ahead of my schedule and this week I started to work on support for vectorization of N-dimensional arrays. By far the biggest challenge was to implement proper broadcasting and most of this post will be devoted to going through that. At the end I also mention some of the other things I have done during the week.

Broadcasting arrays

At the moment I have implemented support for broadcasting in all binary functions. Since all binary functions behave similarly with respect to broadcasting I will use $+$ in all my examples below, but it could in principle be any binary function working on intervals.

When adding two arrays, $A$ and $B$, of the same size the result is just an array of the same size where each entry contains the sum of the corresponding entries in $A$ and $B$. If $A$ and $B$ do not have the same size then we try to perform broadcasting. The simplest form of broadcasting is when $A$ is an array and $B$ is a scalar. Then we just take the value of $B$ and add it to every element of $A$. For example

> A = infsupdec ([1, 2; 3, 4])
A = 2×2 interval matrix
   [1]_com   [2]_com
   [3]_com   [4]_com
> B = infsupdec (5)
B = [5]_com
> A + B
ans = 2×2 interval matrix
   [6]_com   [7]_com
   [8]_com   [9]_com

However it is not only when one of the inputs is a scalar that broadcasting can be performed. Broadcasting is performed separately for each dimension of the input. We require either that the dimensions are equal, in which case no broadcasting is performed, or that one of the inputs has that dimension equal to $1$; that input is then conceptually repeated along the dimension until the two sizes match. If for example $A$ has dimension $4\times4\times4$ and $B$ dimension $4\times4\times1$ we stack four copies of $B$ along the third dimension to get two arrays of the same size. Since a scalar has all dimensions equal to $1$ we see that it can be broadcast to any size. Both $A$ and $B$ can also be broadcast at the same time, along different dimensions, for example

> A = infsupdec (ones (1, 5, 2))
A = 1×5×2 interval array
ans(:,:,1) =
   [1]_com   [1]_com   [1]_com   [1]_com   [1]_com
ans(:,:,2) =
   [1]_com   [1]_com   [1]_com   [1]_com   [1]_com
> B = infsupdec ([1, 2, 3, 4, 5; 6, 7, 8, 9, 10])
B = 2×5 interval matrix
   [1]_com   [2]_com   [3]_com   [4]_com    [5]_com
   [6]_com   [7]_com   [8]_com   [9]_com   [10]_com
> A + B
ans = 2×5×2 interval array
ans(:,:,1) =
   [2]_com   [3]_com   [4]_com    [5]_com    [6]_com
   [7]_com   [8]_com   [9]_com   [10]_com   [11]_com
ans(:,:,2) =
   [2]_com   [3]_com   [4]_com    [5]_com    [6]_com
   [7]_com   [8]_com   [9]_com   [10]_com   [11]_com

The implementation

I'll go through my implementation a little. I warn you that I'm not that familiar with the internals of Octave, so some things I say might be wrong, or at least not totally correct.

Internally all numerical arrays are stored as a linear vector and the dimensions are only metadata. This means that the most efficient way to walk through an array is with a linearly increasing index. When $A$ and $B$ have the same size the most efficient way to sum them is thus to go through the arrays linearly. In pseudo code

// Calculate C = A + B
for (int i = 0; i < numel (A); i++) {
  C(i) = A(i) + B(i);
}

This works fine, and apart from unrolling the loop or doing similar optimizations it is probably the most efficient way to do it.

If $A$ and $B$ are not of the same size then one way to do it would be to simply extend $A$ and/or $B$ along the needed dimensions. This would however require copying a lot of data, something we want to avoid (memory access is expensive). Instead we try to be smart with our indexing to access the right data from both $A$ and $B$.
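
To make the idea concrete, here is a rough sketch in plain Octave of adding two broadcast-compatible numeric arrays using index arithmetic only. The function name $broadcast\_add$ is made up for illustration; the real code lives in the oct-files and works directly on the linear indices, but the principle is the same.

function C = broadcast_add (A, B)
  # Assumes the sizes of A and B are compatible, i.e. every dimension is
  # either equal in both or 1 in one of them.
  nd = max (ndims (A), ndims (B));
  sza = postpad (size (A), nd, 1);   # pad trailing dimensions with 1
  szb = postpad (size (B), nd, 1);
  szc = max (sza, szb);              # size of the result
  C = zeros (szc);
  subs = cell (1, nd);
  for i = 1:prod (szc)
    [subs{:}] = ind2sub (szc, i);    # subscripts into the result
    s = cell2mat (subs);
    ia = num2cell (min (s, sza));    # broadcast (singleton) dimensions stay at 1
    ib = num2cell (min (s, szb));
    C(i) = A(ia{:}) + B(ib{:});      # no copies of A or B are ever made
  endfor
endfunction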

After asking on the IRC channel I was pointed to this Octave function which performs broadcasting. My implementation, which can be found here, is heavily inspired by that function.

Performance

Here I compare the performance of the new implementation with the old one. Since the old one can only handle matrices we are limited to matrices in the comparison. We measure the time it takes to add two matrices $A$ and $B$ with the code

tic; A + B; toc;

We do 10 runs for each test and all times are in seconds.

Addition of large matrices

Case 1: A = B = infsupdec (ones (1000, 1000));
       Old         New
       0.324722    0.277179
       0.320914    0.276116
       0.322018    0.276075
       0.318713    0.279258
       0.332041    0.279593
       0.318429    0.279987
       0.323752    0.279089
       0.317823    0.276036
       0.320509    0.280964
       0.320610    0.281123
Mean:  0.32195     0.27854

Case 2: A = B = infsupdec (ones (10, 100000));
        Old         New
        0.299321    0.272691
        0.297020    0.282591
        0.296460    0.274298
        0.294541    0.279661
        0.298306    0.277274
        0.301532    0.275531
        0.298163    0.278576
        0.298954    0.279868
        0.302849    0.275991
        0.297765    0.278806
Mean:   0.29849    0.27753

Case 3: A = B = infsupdec (ones (100000, 10));
        Old         New
        0.286433    0.279107
        0.289503    0.278251
        0.297562    0.279579
        0.292759    0.283311
        0.292983    0.281306
        0.290947    0.282310
        0.293025    0.286172
        0.294153    0.278886
        0.293457    0.278625
        0.296661    0.280804
Mean:   0.29275     0.28084

Broadcasting scalars

Case 4: A = infsupdec (ones (1000, 1000));
             B = infsupdec (1);
        Old         New
        0.298695    0.292419
        0.298158    0.292274
        0.305242    0.296036
        0.295867    0.291311
        0.296971    0.297255
        0.304297    0.292871
        0.298172    0.300329
        0.297251    0.291668
        0.299236    0.294128
        0.300457    0.298005
Mean:   0.29943     0.29463

Case 5: A = infsupdec (1);
             B = infsupdec (ones (1000, 1000));
         Old         New
        0.317276    0.291100
        0.316858    0.296519
        0.316617    0.292958
        0.316159    0.299662
        0.317939    0.301558
        0.322162    0.295338
        0.321277    0.293561
        0.314640    0.291500
        0.317211    0.295487
        0.317177    0.294376
Mean:   0.31773     0.29521

Broadcasting vectors

Case 6: A = infsupdec (ones (1000, 1000));
             B = infsupdec (ones (1000, 1));
        Old         New
        0.299546    0.284229
        0.301177    0.284458
        0.300725    0.276269
        0.299368    0.276957
        0.303953    0.278034
        0.300894    0.275058
        0.301776    0.276692
        0.302462    0.282946
        0.304010    0.275573
        0.301196    0.273109
Mean:   0.30151     0.27833

Case 7: A = infsupdec (ones (1000, 1000));
             B = infsupdec (ones (1, 1000));
         Old         New
        0.300554    0.295892
        0.301361    0.294287
        0.302575    0.299116
        0.304808    0.294184
        0.306700    0.291606
        0.301233    0.298059
        0.301591    0.292777
        0.302998    0.290288
        0.300452    0.291975
        0.305531    0.290178
Mean:   0.30278     0.29384

We see that in all cases the new version is faster than, or at least as fast as, the old version. In the old version the order of the inputs made a slight difference in performance (case 4 vs case 5). In the new version both inputs are treated in exactly the same way, so we no longer see that difference.

Possible improvements

In theory the cases where we broadcast a scalar could be the fastest ones. If $B$ is a scalar we could, in pseudo code, do something similar to

// Calculate C = A + B with B scalar
for (int i = 0; i < numel (A); i++) {
  C(i) = A(i) + B;
}

This is however not implemented at the moment. Instead we use the ordinary routine to calculate the index for $B$ (since it is a scalar it will always evaluate to $1$). If we would like to optimize more for this case we could add a check for whether $A$ or $B$ is a scalar and then a specialized loop for that. Of course this would also make the code more complicated, something to watch out for. At the moment I leave it like this, but if we later want to optimize for that case it can be done.

Other work

Apart from the work to fix the broadcasting for binary functions there was very little to do for many of the functions. All binary functions that use this code, and all unary functions using an even simpler code path, worked directly after fixing the oct-files. Some of them required small changes to the documentation but other than that the Octave scripts were fine. So mainly it has been a matter of actually going through all the files and checking that they did in fact work.

Bug #51283

When going through all the functions I noticed a bug in the interval version of $\sin$,

> sin (infsupdec (0))
ans = [0]_com
> sin (infsupdec ([0, 0]))
ans = 1×2 interval vector
   [0, -0]_com   [0, -0]_com

The second result here is wrong; $-0$ should never be allowed as a value for the supremum of an interval. I was able to track this down to how Octave's $\max$ function works, see bug #51283. As Oliver writes there, the exact behaviour of the $\max$ function is not specified in IEEE Std 754-2008, so we cannot rely on it. To solve this I have added a line that manually sets all $-0$ to $+0$ in the supremum of the interval.
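
The fix amounts to a single assignment along these lines (a sketch, not necessarily the exact line in the package; here $u$ holds the computed suprema):

u(u == 0) = 0;   # -0 == 0 is true, and assigning the literal 0 stores +0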

 

Friday 16 June 2017

Construction and Printing

This week I have started to work on methods for constructing and printing N-dimensional arrays of intervals. In my timeline I estimated that this work would take two weeks. However, in this first week I have managed to complete most of the work. I will give some comments on how I have worked with the Mercurial repository, how the work went and different things I encountered along the way.

Working with Mercurial

This is essentially the first time I'm using Mercurial for revision control, though I have used git before. However, I quickly learned how to use it for the basic tasks that I need: committing, comparing files and checking the history. As mentioned in a previous post you can find my repository here [1].

Coding style

When I started to work with the files I realized that they did not follow Octave's coding standard [2]. After a short discussion on the mailing list we decided that I will update the files I change to follow the standard coding style. Usually it is not a good idea to change coding style and add functionality in the same commit. However, most of the coding style changes are only whitespace changes, so they can be ignored using the -w flag in Mercurial. Thus we decided that as long as the coding style changes are such that they are ignored with -w, I do them in the same commit as the added functionality. If there are coding style changes that are more than whitespace, the most common example being overly long lines, I first do a commit with only the coding style changes. So if you want to take a look at the functionality I have added you will probably want to use the -w flag. Note however that I have not updated the coding style for any files I have not otherwise changed.

Committing

Normally I do one commit for each file, though in many cases the bare intervals and the decorated intervals have almost identical functions and in that case I commit changes to both of them at the same time. Of course it also happens that I have to go back and do more changes to a file; in that case I just do another commit.

The actual work

The work went much faster than I expected. The main reason for this is that Octave has very good support for indexing. For example expressions like

isnai(x.inf <= x.sup) = false;

works just as well for matrices as for N-dimensional arrays. In fact the constructor for bare intervals even worked for N-dimensional arrays from the beginning; there I only had to make slight modifications to the documentation and add some tests!

Not all functions were that easy though. Some functions that have not been updated in a while clearly assumed the input was a matrix, for example in $hull$

sizes1 = cellfun ("size", l, 1);
sizes2 = cellfun ("size", l, 2);


In most cases I only needed to add more general indexing, oftentimes even making the code clearer.
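
For illustration, one way to collect the sizes for any number of dimensions is something like the following (a sketch, not necessarily the exact code now used in $hull$; $l$ is the cell array of inputs):

nd = max (cellfun ("ndims", l));
sizes = cell2mat (cellfun (@(x) postpad (size (x), nd, 1), l(:), "UniformOutput", false));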

In some functions all I had to do was to remove the check on the input data so that they would accept N-dimensional arrays. This was true for example in $cat$, where all I had to do was to remove the check and make some minor modifications to the documentation.

I can conclude by saying that Octave has great support for working with N-dimensional arrays. Since internally the data for intervals is stored as ordinary arrays I could make good use of it!

Noteworthy things

While most functions were straightforward to modify, some required some thought. How should they even work for N-dimensional input?

Disp

When modifying the $disp$ function I chose to mimic how Octave handles displaying N-dimensional arrays. I noticed that this is different from how Matlab handles it. In Matlab we have

> x = zeros (2, 2, 2)

x(:,:,1) =

     0     0
     0     0


x(:,:,2) =

     0     0
     0     0


while in Octave it's

> x = zeros (2, 2, 2)
x =

ans(:,:,1) =

   0   0
   0   0

ans(:,:,2) =

   0   0
   0   0


I don't know the reasoning behind Octave's version. At least at first glance I think I prefer the way Matlab does it. But since I'm working in Octave I chose that style.

The next question was how to handle the subset symbol, $\subset$. The interval package uses $=$ or $\subset$ depending on whether the string representation is exact or not. For example

> x = infsup (1/2048, 1 + 1/2048);
> format short; x
x ⊂ [0.00048828, 1.0005]
> format long; x
x = [0.00048828125, 1.00048828125]

How should this be handled for N-dimensional arrays? One way would be to switch all $=$ to $\subset$ if the representation is not exact. Another would be to use $\subset$ for all submatrices that do not have an exact string representation. The third way, and how it is implemented now, is to only change the first $=$ to $\subset$, the one after the variable name. Like this

> x(1,1,1:2) = infsup (1/2048, 1 + 1/2048)
x ⊂ 1×1×2 interval array

ans(:,:,1) =   [0.00048828, 1.0005]
ans(:,:,2) =   [0.00048828, 1.0005]


This might look a bit odd at first: in some places we use $=$ and in others $\subset$. Though I think it somehow makes sense, we are saying that $x$ is a subset of the $1\times1\times2$ interval array given by

ans(:,:,1) =   [0.00048828, 1.0005]
ans(:,:,2) =   [0.00048828, 1.0005]

which actually is true. Anyway, I will leave it like this for now and then we might decide to switch it up later.

linspace and mince

The standard implementation of $linspace$ only supports scalar or vector input. It could be generalized to N-dimensional arrays by, for example, returning an (N+1)-dimensional array where the last dimension corresponds to the linearly spaced elements. But since this has not been done in the standard implementation I will at least wait with adding it for intervals.

The function $mince$ can be seen as an interval generalization of $linspace$. It takes an interval and returns an array of intervals whose union covers it. This could similarly be expanded to N dimensions by creating the array along the (N+1)th dimension. But again we choose to at least wait with adding this.
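
For reference, this is roughly what $mince$ does today for a scalar interval (a sketch of a session, the exact output format may differ):

> mince (infsup (0, 10), 5)
ans = 1×5 interval vector

   [0, 2]   [2, 4]   [4, 6]   [6, 8]   [8, 10]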

meshgrid and ndgrid

The interval package already has an implementation of $meshgrid$. But since it previously did not support 3-dimensional arrays it had to fit 3-d data into a 2-d matrix. Now that 3-d data is supported it can output that instead.
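
For example, with three output arguments the interval $meshgrid$ can now return genuine 3-d interval arrays (a sketch of a session, only the size is shown):

> [X, Y, Z] = meshgrid (infsup (1:2), 3:4, 5:6);
> size (X)
ans =

   2   2   2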

Currently the interval package does not implement $ndgrid$. When I looked into it I realized that the standard implementation of $ndgrid$ actually works for interval arrays as well. I have not looked into the internals, but in principle it should only need the $cat$ function, which is implemented for intervals. Further I noticed that the standard $meshgrid$ also works for intervals. However the interval implementation differs in that it converts all input to intervals, whereas the standard implementation allows for non-uniform output. Using the interval implementation of $meshgrid$ we have

> [X Y] = meshgrid (infsup (1:3), 4:6)
X = 3×3 interval matrix

   [1]   [2]   [3]
   [1]   [2]   [3]
   [1]   [2]   [3]

Y = 3×3 interval matrix

   [4]   [4]   [4]
   [5]   [5]   [5]
   [6]   [6]   [6]

but if we fall back to the standard implementation (by removing the interval implementation) we get

> [X Y] = meshgrid (infsup (1:3), 4:6)
X = 3×3 interval matrix

   [1]   [2]   [3]
   [1]   [2]   [3]
   [1]   [2]   [3]

Y =

   4   4   4
   5   5   5
   6   6   6

Note that the last matrix is not an interval matrix.

So the question is, should we implement a version of $ndgrid$ that converts everything to intervals or should we remove the interval implementation of $meshgrid$? It is most likely not a good idea for the two functions to behave differently. I think that removing the implementation of $meshgrid$ makes the most sense. First of all it is less code to maintain, which is always nice. Secondly, you can manually convert all input to intervals if you want uniform output. If you do not want uniform output then the standard implementation works whereas the interval implementation does not, so the standard implementation is more general in a sense.

We have to choose what to do, but for now I leave it as it is.

Non-generalizable functions

From what I have found there is no way to create a 3-dimensional array in Octave in the same way you can create a 2-dimensional one with for example

M = [1, 2; 3, 4];

Instead higher-dimensional arrays have to be created using other functions, for example $reshape$ or $zeros$, or by specifying the submatrices directly

M(:,:,1) = [1, 2; 3, 4];
M(:,:,2) = [5, 6; 7, 8];
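
The same array can for example also be built with $reshape$, keeping in mind that Octave fills arrays in column-major order:

M = reshape ([1, 3, 2, 4, 5, 7, 6, 8], 2, 2, 2);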

This means that the function $\_\_split\_interval\_literals\_\_$, which is used to split a string like $"[1, 2; 3, 4]"$ into its separate components, cannot really be generalized to N dimensions.


[1] https://sourceforge.net/u/urathai/octave/ci/default/tree/
[2] http://wiki.octave.org/Octave_style_guide