Performance of not

Optimizations for not with special index types

For faster indexing, this package provides optimizations for not with some special index types, which means not(x) is not equivalent to FI(!in(x)) for x belonging to those index types, and will be converted to different index types by to_indices function.

There are two list of those index types, both of which are enabled for not, but for customized NotIndex types, optimizations in second list is enabled when indextype returns Vector{Int}.

There optimizations are enabled for any AbstractNotIndex types by default:

  • x::Colon will be converted to an empty Int array: Int[];
  • x::AbstractArray{Bool} will be converted to a LogicalIndex with mask mappedarray(!, x);
  • x::AbstractArray will be converted like FI(!in(x′)), while x′ is a Set like array converted from x with faster in;
  • For I′, J′ = to_indices(A, (not(I), not(J))), not(I′) and not(I′) will revert to I, J.

There optimizations are enabled only if indextype is defined as Vector{Int}:

  • x::Integer or x::OrdinalRange{<:Integer} will be converted to an Int array where x is removed from given axis.
  • x::Base.Slice will be converted to an empty Int array, when the slice represents the given axis, Otherwise, it will be treated as a normal AbstractUnitRange.

Performance tips for not

For small array, the optimized not(x) might be slower in some case, because of the overhead for creating a Set, see below.

There are some tips for better performance:

  • Use FI(!in(x)) instead of not(x).
  • Create your own "Not" type, see below example for details.
  • For a small array of indices, not(1, 2, 3) will faster than not([1, 2, 3]).

Performance comparing

This is a performance comparing of "Inverted Indexing" for different methods. There are five methods:

  • bynot: by not of this package;
  • byfi: by function index FI(!in(I));
  • bymap: by logical indices which test map(!in(I), axis);
  • byfilter: by removing I from axes by filter;
  • byNot: by InvertedIndices.Not.

The minimum time and allocation of each method for different type of index were compared:

using BenchmarkTools
using InvertedIndices
using FunctionIndices
using Latexify

# Linear
bynot(A, I) = A[not(I)]
byfi(A, I) = A[FI(!in(I))]
bymap(A, I) = A[map(!in(I), begin:end)]
byfilter(A, I) = A[filter(!in(I), begin:end)]
byNot(A, I) = A[Not(I)]

# Cartesian
bynot(A, Is...) = A[ntuple(i -> not(Is[i]), Val(length(Is)))...]
byfi(A, Is...) = A[ntuple(i -> FI(!in(Is[i])), Val(length(Is)))...]
bymap(A, Is...) = A[ntuple(i -> map(!in(Is[i]), axes(A, i)), Val(length(Is)))...]
byfilter(A, Is...) = A[ntuple(i -> filter(!in(Is[i]), axes(A, i)), Val(length(Is)))...]
byNot(A, Is...) = A[ntuple(i -> Not(Is[i]), Val(length(Is)))...]

const As = (rand(10), rand(10, 10), rand(10, 10, 10))
const fs = (bynot, byfi, bymap, byfilter, byNot)

# convert tuple of tuple to matrix
tt2mt(trs) = hcat(map(tr -> vcat(tr...), trs)...)

maketable(bench, As::Tuple=As, fs::Tuple=fs) = mdtable(
    # use map of map instead of for loop for typle stable
    map(
        f -> map(
            A -> begin
                trial = bench(f, A)
                trialmin = minimum(trial)
                trialallocs = allocs(trialmin)
                string(
                    BenchmarkTools.prettytime(time(trialmin)),
                    " (", trialallocs , " alloc",
                    trialallocs == 1 ? "" : "s", ": ",
                    BenchmarkTools.prettymemory(memory(trialmin)), ")"
                )
            end,
            As
        ),
        fs
    ) |> tt2mt;
    head=fs, side=[:Size; size.(As)...], latex=false,
);
maketable (generic function with 3 methods)

Indexing with Int

A random inbounds Int index.

Linear Indexing

maketable() do f, A
    ind = rand(firstindex(A):lastindex(A))
    @benchmark $f($A, $ind)
end
SizebynotbyfibymapbyfilterbyNot
(10,)100.213 ns (2 allocs: 256 bytes)54.776 ns (1 alloc: 128 bytes)96.321 ns (2 allocs: 192 bytes)166.581 ns (3 allocs: 400 bytes)2.000 μs (53 allocs: 1.72 KiB)
(10, 10)208.417 ns (2 allocs: 1.75 KiB)190.324 ns (1 alloc: 896 bytes)238.715 ns (2 allocs: 1.03 KiB)362.500 ns (3 allocs: 2.62 KiB)20.600 μs (526 allocs: 17.25 KiB)
(10, 10, 10)1.720 μs (2 allocs: 15.88 KiB)1.710 μs (1 alloc: 7.94 KiB)1.620 μs (2 allocs: 9.00 KiB)3.000 μs (3 allocs: 23.81 KiB)213.801 μs (6265 allocs: 188.38 KiB)

Multidimensional Indexing

maketable() do f, A
    inds = ntuple(i -> rand(axes(A, i)), Val(ndims(A)))
    @benchmark $f($A, $inds...)
end
SizebynotbyfibymapbyfilterbyNot
(10,)100.852 ns (2 allocs: 256 bytes)55.533 ns (1 alloc: 128 bytes)95.789 ns (2 allocs: 192 bytes)165.207 ns (3 allocs: 400 bytes)1.880 μs (47 allocs: 1.53 KiB)
(10, 10)262.353 ns (3 allocs: 992 bytes)220.751 ns (1 alloc: 736 bytes)302.745 ns (3 allocs: 864 bytes)392.118 ns (5 allocs: 1.25 KiB)19.400 μs (504 allocs: 16.12 KiB)
(10, 10, 10)1.933 μs (11 allocs: 6.41 KiB)2.367 μs (8 allocs: 6.03 KiB)2.667 μs (8 allocs: 6.19 KiB)2.111 μs (11 allocs: 6.80 KiB)168.901 μs (4216 allocs: 134.75 KiB)

Indexing with UnitRange

A random inbound UnitRange with half-length of axe.

Linear Indexing

maketable() do f, A
    axe = firstindex(A):lastindex(A)
    half = length(axe) ÷ 2
    b = rand(axe[begin:(end - half)])
    e = b + half
    @benchmark $f($A, $(b:e))
end
SizebynotbyfibymapbyfilterbyNot
(10,)91.527 ns (2 allocs: 192 bytes)60.081 ns (1 alloc: 96 bytes)99.048 ns (2 allocs: 160 bytes)151.412 ns (3 allocs: 336 bytes)929.032 ns (23 allocs: 848 bytes)
(10, 10)139.026 ns (2 allocs: 896 bytes)163.861 ns (1 alloc: 448 bytes)225.375 ns (2 allocs: 608 bytes)320.874 ns (3 allocs: 1.75 KiB)10.500 μs (283 allocs: 10.94 KiB)
(10, 10, 10)1.130 μs (2 allocs: 8.12 KiB)1.360 μs (1 alloc: 4.06 KiB)1.470 μs (2 allocs: 5.12 KiB)2.678 μs (3 allocs: 16.06 KiB)107.300 μs (3163 allocs: 113.22 KiB)

Multidimensional Indexing

maketable() do f, A
    inds = ntuple(
        i -> begin
            axe = axes(A, i)
            half = length(axe) ÷ 2
            b = rand(axe[begin:(end - half)])
            e = b + half
            b:e
        end,
        Val(ndims(A))
    )
    @benchmark $f($A, $inds...)
end
SizebynotbyfibymapbyfilterbyNot
(10,)91.424 ns (2 allocs: 192 bytes)61.672 ns (1 alloc: 96 bytes)100.740 ns (2 allocs: 160 bytes)151.296 ns (3 allocs: 336 bytes)955.556 ns (24 allocs: 928 bytes)
(10, 10)169.735 ns (3 allocs: 384 bytes)143.148 ns (1 alloc: 192 bytes)201.311 ns (3 allocs: 320 bytes)290.146 ns (5 allocs: 672 bytes)4.629 μs (116 allocs: 4.25 KiB)
(10, 10, 10)943.478 ns (14 allocs: 1.23 KiB)1.040 μs (11 allocs: 976 bytes)1.010 μs (11 allocs: 1.02 KiB)1.070 μs (14 allocs: 1.53 KiB)19.300 μs (464 allocs: 15.22 KiB)

Indexing with StepRange

A StepRange with step 2.

Linear Indexing

maketable() do f, A
    ind = firstindex(A):2:lastindex(A)
    @benchmark $f($A, $ind)
end
SizebynotbyfibymapbyfilterbyNot
(10,)137.748 ns (2 allocs: 192 bytes)248.551 ns (1 alloc: 96 bytes)182.921 ns (2 allocs: 160 bytes)258.092 ns (3 allocs: 336 bytes)1.170 μs (31 allocs: 1.31 KiB)
(10, 10)528.947 ns (2 allocs: 992 bytes)2.233 μs (1 alloc: 496 bytes)1.220 μs (2 allocs: 656 bytes)1.430 μs (3 allocs: 1.84 KiB)10.901 μs (301 allocs: 12.95 KiB)
(10, 10, 10)4.643 μs (2 allocs: 8.12 KiB)22.100 μs (1 alloc: 4.06 KiB)11.700 μs (2 allocs: 5.12 KiB)13.300 μs (3 allocs: 16.06 KiB)110.200 μs (3491 allocs: 136.69 KiB)

Multidimensional Indexing

maketable() do f, A
    inds = ntuple(
        i -> begin
            axe = axes(A, i)
            axe[begin:2:end]
        end,
        Val(ndims(A))
    )
    @benchmark $f($A, $inds...)
end
SizebynotbyfibymapbyfilterbyNot
(10,)139.005 ns (2 allocs: 192 bytes)247.343 ns (1 alloc: 96 bytes)183.796 ns (2 allocs: 160 bytes)257.647 ns (3 allocs: 336 bytes)1.190 μs (31 allocs: 1.31 KiB)
(10, 10)278.749 ns (3 allocs: 448 bytes)923.529 ns (1 alloc: 256 bytes)406.000 ns (3 allocs: 384 bytes)517.714 ns (5 allocs: 736 bytes)6.975 μs (181 allocs: 7.56 KiB)
(10, 10, 10)1.200 μs (14 allocs: 1.78 KiB)4.600 μs (11 allocs: 1.50 KiB)1.530 μs (11 allocs: 1.56 KiB)1.500 μs (14 allocs: 2.08 KiB)36.700 μs (950 allocs: 39.73 KiB)

Indexing with Vector{Int}

A Vector{Int} which is a collected StepRange.

Linear Indexing

maketable() do f, A
    ind = collect(firstindex(A):2:lastindex(A))
    @benchmark $f($A, $ind)
end
SizebynotbyfibymapbyfilterbyNot
(10,)296.139 ns (5 allocs: 496 bytes)107.175 ns (1 alloc: 96 bytes)123.386 ns (2 allocs: 160 bytes)177.180 ns (3 allocs: 336 bytes)1.190 μs (31 allocs: 1.22 KiB)
(10, 10)1.830 μs (8 allocs: 2.09 KiB)5.100 μs (1 alloc: 496 bytes)2.700 μs (2 allocs: 656 bytes)3.388 μs (3 allocs: 1.84 KiB)10.900 μs (301 allocs: 12.16 KiB)
(10, 10, 10)25.301 μs (8 allocs: 13.59 KiB)393.503 μs (1 alloc: 4.06 KiB)197.702 μs (2 allocs: 5.12 KiB)261.302 μs (3 allocs: 16.06 KiB)110.000 μs (3491 allocs: 128.86 KiB)

Multidimensional Indexing

maketable() do f, A
    inds = ntuple(
        i -> begin
            axe = axes(A, i)
            collect(axe[begin:2:end])
        end,
        Val(ndims(A))
    )
    @benchmark $f($A, $inds...)
end
SizebynotbyfibymapbyfilterbyNot
(10,)293.487 ns (5 allocs: 496 bytes)106.774 ns (1 alloc: 96 bytes)119.583 ns (2 allocs: 160 bytes)175.797 ns (3 allocs: 336 bytes)1.190 μs (31 allocs: 1.22 KiB)
(10, 10)789.691 ns (9 allocs: 1.03 KiB)362.684 ns (1 alloc: 256 bytes)268.054 ns (3 allocs: 384 bytes)348.848 ns (5 allocs: 736 bytes)6.780 μs (181 allocs: 7.00 KiB)
(10, 10, 10)3.000 μs (20 allocs: 2.39 KiB)1.960 μs (8 allocs: 1.22 KiB)1.290 μs (8 allocs: 1.38 KiB)1.210 μs (11 allocs: 1.89 KiB)36.100 μs (950 allocs: 36.53 KiB)