Performance of not

Optimizations for not with special index types

For faster indexing, this package provides optimizations for not with some special index types, which means not(x) is not equivalent to FI(!in(x)) for x belonging to those index types, and will be converted to different index types by to_indices function.

There are two list of those index types, both of which are enabled for not, but for customized NotIndex types, optimizations in second list is enabled when indextype returns Vector{Int}.

There optimizations are enabled for any AbstractNotIndex types by default:

  • x::Colon will be converted to an empty Int array: Int[];
  • x::AbstractArray{Bool} will be converted to a LogicalIndex with mask mappedarray(!, x);
  • x::AbstractArray will be converted like FI(!in(x′)), while x′ is a Set like array converted from x with faster in;
  • For I′, J′ = to_indices(A, (not(I), not(J))), not(I′) and not(I′) will revert to I, J.

There optimizations are enabled only if indextype is defined as Vector{Int}:

  • x::Integer or x::OrdinalRange{<:Integer} will be converted to an Int array where x is removed from given axis.
  • x::Base.Slice will be converted to an empty Int array, when the slice represents the given axis, Otherwise, it will be treated as a normal AbstractUnitRange.

Performance tips for not

For small array, the optimized not(x) might be slower in some case, because of the overhead for creating a Set, see below.

There are some tips for better performance:

  • Use FI(!in(x)) instead of not(x).
  • Create your own "Not" type, see below example for details.
  • For a small array of indices, not(1, 2, 3) will faster than not([1, 2, 3]).

Performance comparing

This is a performance comparing of "Inverted Indexing" for different methods. There are five methods:

  • bynot: by not of this package;
  • byfi: by function index FI(!in(I));
  • bymap: by logical indices which test map(!in(I), axis);
  • byfilter: by removing I from axes by filter;
  • byNot: by InvertedIndices.Not.

The minimum time and allocation of each method for different type of index were compared:

using BenchmarkTools
using InvertedIndices
using FunctionIndices
using Latexify

# Linear
bynot(A, I) = A[not(I)]
byfi(A, I) = A[FI(!in(I))]
bymap(A, I) = A[map(!in(I), begin:end)]
byfilter(A, I) = A[filter(!in(I), begin:end)]
byNot(A, I) = A[Not(I)]

# Cartesian
bynot(A, Is...) = A[ntuple(i -> not(Is[i]), Val(length(Is)))...]
byfi(A, Is...) = A[ntuple(i -> FI(!in(Is[i])), Val(length(Is)))...]
bymap(A, Is...) = A[ntuple(i -> map(!in(Is[i]), axes(A, i)), Val(length(Is)))...]
byfilter(A, Is...) = A[ntuple(i -> filter(!in(Is[i]), axes(A, i)), Val(length(Is)))...]
byNot(A, Is...) = A[ntuple(i -> Not(Is[i]), Val(length(Is)))...]

const As = (rand(10), rand(10, 10), rand(10, 10, 10))
const fs = (bynot, byfi, bymap, byfilter, byNot)

# convert tuple of tuple to matrix
tt2mt(trs) = hcat(map(tr -> vcat(tr...), trs)...)

maketable(bench, As::Tuple=As, fs::Tuple=fs) = mdtable(
    # use map of map instead of for loop for typle stable
    map(
        f -> map(
            A -> begin
                trial = bench(f, A)
                trialmin = minimum(trial)
                trialallocs = allocs(trialmin)
                string(
                    BenchmarkTools.prettytime(time(trialmin)),
                    " (", trialallocs , " alloc",
                    trialallocs == 1 ? "" : "s", ": ",
                    BenchmarkTools.prettymemory(memory(trialmin)), ")"
                )
            end,
            As
        ),
        fs
    ) |> tt2mt;
    head=fs, side=[:Size; size.(As)...], latex=false,
);
maketable (generic function with 3 methods)

Indexing with Int

A random inbounds Int index.

Linear Indexing

maketable() do f, A
    ind = rand(firstindex(A):lastindex(A))
    @benchmark $f($A, $ind)
end
SizebynotbyfibymapbyfilterbyNot
(10,)102.438 ns (2 allocs: 256 bytes)58.698 ns (1 alloc: 128 bytes)97.583 ns (2 allocs: 192 bytes)166.801 ns (3 allocs: 400 bytes)1.989 μs (48 allocs: 1.56 KiB)
(10, 10)263.972 ns (2 allocs: 1.75 KiB)184.913 ns (1 alloc: 896 bytes)248.141 ns (2 allocs: 1.03 KiB)423.272 ns (3 allocs: 2.62 KiB)21.200 μs (534 allocs: 17.50 KiB)
(10, 10, 10)2.078 μs (2 allocs: 15.88 KiB)1.770 μs (1 alloc: 7.94 KiB)1.690 μs (2 allocs: 9.00 KiB)3.638 μs (3 allocs: 23.81 KiB)218.402 μs (6244 allocs: 187.72 KiB)

Multidimensional Indexing

maketable() do f, A
    inds = ntuple(i -> rand(axes(A, i)), Val(ndims(A)))
    @benchmark $f($A, $inds...)
end
SizebynotbyfibymapbyfilterbyNot
(10,)105.403 ns (2 allocs: 256 bytes)59.045 ns (1 alloc: 128 bytes)98.005 ns (2 allocs: 192 bytes)168.321 ns (3 allocs: 400 bytes)1.950 μs (47 allocs: 1.53 KiB)
(10, 10)271.475 ns (3 allocs: 992 bytes)227.081 ns (1 alloc: 736 bytes)327.636 ns (3 allocs: 864 bytes)410.558 ns (5 allocs: 1.25 KiB)19.800 μs (517 allocs: 16.53 KiB)
(10, 10, 10)2.011 μs (11 allocs: 6.41 KiB)2.544 μs (8 allocs: 6.03 KiB)2.900 μs (8 allocs: 6.19 KiB)2.189 μs (11 allocs: 6.80 KiB)182.402 μs (4774 allocs: 152.19 KiB)

Indexing with UnitRange

A random inbound UnitRange with half-length of axe.

Linear Indexing

maketable() do f, A
    axe = firstindex(A):lastindex(A)
    half = length(axe) ÷ 2
    b = rand(axe[begin:(end - half)])
    e = b + half
    @benchmark $f($A, $(b:e))
end
SizebynotbyfibymapbyfilterbyNot
(10,)92.252 ns (2 allocs: 192 bytes)61.851 ns (1 alloc: 96 bytes)100.212 ns (2 allocs: 160 bytes)154.132 ns (3 allocs: 336 bytes)944.000 ns (25 allocs: 1008 bytes)
(10, 10)160.844 ns (2 allocs: 896 bytes)157.665 ns (1 alloc: 448 bytes)229.233 ns (2 allocs: 608 bytes)345.502 ns (3 allocs: 1.75 KiB)10.400 μs (250 allocs: 8.36 KiB)
(10, 10, 10)1.310 μs (2 allocs: 8.12 KiB)1.400 μs (1 alloc: 4.06 KiB)1.510 μs (2 allocs: 5.12 KiB)3.025 μs (3 allocs: 16.06 KiB)107.201 μs (3072 allocs: 117.48 KiB)

Multidimensional Indexing

maketable() do f, A
    inds = ntuple(
        i -> begin
            axe = axes(A, i)
            half = length(axe) ÷ 2
            b = rand(axe[begin:(end - half)])
            e = b + half
            b:e
        end,
        Val(ndims(A))
    )
    @benchmark $f($A, $inds...)
end
SizebynotbyfibymapbyfilterbyNot
(10,)92.478 ns (2 allocs: 192 bytes)60.794 ns (1 alloc: 96 bytes)100.317 ns (2 allocs: 160 bytes)154.358 ns (3 allocs: 336 bytes)973.333 ns (24 allocs: 928 bytes)
(10, 10)174.357 ns (3 allocs: 384 bytes)137.373 ns (1 alloc: 192 bytes)200.169 ns (3 allocs: 320 bytes)297.280 ns (5 allocs: 672 bytes)4.614 μs (118 allocs: 4.41 KiB)
(10, 10, 10)945.455 ns (14 allocs: 1.23 KiB)1.050 μs (11 allocs: 976 bytes)1.060 μs (11 allocs: 1.02 KiB)1.170 μs (14 allocs: 1.53 KiB)21.100 μs (585 allocs: 22.67 KiB)

Indexing with StepRange

A StepRange with step 2.

Linear Indexing

maketable() do f, A
    ind = firstindex(A):2:lastindex(A)
    @benchmark $f($A, $ind)
end
SizebynotbyfibymapbyfilterbyNot
(10,)136.449 ns (2 allocs: 192 bytes)247.153 ns (1 alloc: 96 bytes)183.996 ns (2 allocs: 160 bytes)257.981 ns (3 allocs: 336 bytes)1.190 μs (31 allocs: 1.31 KiB)
(10, 10)522.518 ns (2 allocs: 992 bytes)2.233 μs (1 alloc: 496 bytes)1.230 μs (2 allocs: 656 bytes)1.390 μs (3 allocs: 1.84 KiB)10.700 μs (301 allocs: 12.95 KiB)
(10, 10, 10)4.800 μs (2 allocs: 8.12 KiB)22.100 μs (1 alloc: 4.06 KiB)11.700 μs (2 allocs: 5.12 KiB)13.500 μs (3 allocs: 16.06 KiB)110.401 μs (3491 allocs: 136.69 KiB)

Multidimensional Indexing

maketable() do f, A
    inds = ntuple(
        i -> begin
            axe = axes(A, i)
            axe[begin:2:end]
        end,
        Val(ndims(A))
    )
    @benchmark $f($A, $inds...)
end
SizebynotbyfibymapbyfilterbyNot
(10,)136.668 ns (2 allocs: 192 bytes)246.926 ns (1 alloc: 96 bytes)188.254 ns (2 allocs: 160 bytes)258.384 ns (3 allocs: 336 bytes)1.190 μs (31 allocs: 1.31 KiB)
(10, 10)281.053 ns (3 allocs: 448 bytes)907.692 ns (1 alloc: 256 bytes)413.568 ns (3 allocs: 384 bytes)521.476 ns (5 allocs: 736 bytes)6.860 μs (181 allocs: 7.56 KiB)
(10, 10, 10)1.220 μs (14 allocs: 1.78 KiB)4.557 μs (11 allocs: 1.50 KiB)1.530 μs (11 allocs: 1.56 KiB)1.490 μs (14 allocs: 2.08 KiB)36.800 μs (950 allocs: 39.73 KiB)

Indexing with Vector{Int}

A Vector{Int} which is a collected StepRange.

Linear Indexing

maketable() do f, A
    ind = collect(firstindex(A):2:lastindex(A))
    @benchmark $f($A, $ind)
end
SizebynotbyfibymapbyfilterbyNot
(10,)293.490 ns (5 allocs: 496 bytes)107.619 ns (1 alloc: 96 bytes)120.199 ns (2 allocs: 160 bytes)178.859 ns (3 allocs: 336 bytes)1.180 μs (31 allocs: 1.22 KiB)
(10, 10)1.850 μs (8 allocs: 2.09 KiB)5.100 μs (1 alloc: 496 bytes)2.711 μs (2 allocs: 656 bytes)3.400 μs (3 allocs: 1.84 KiB)10.699 μs (301 allocs: 12.16 KiB)
(10, 10, 10)25.700 μs (8 allocs: 13.59 KiB)393.503 μs (1 alloc: 4.06 KiB)197.702 μs (2 allocs: 5.12 KiB)261.202 μs (3 allocs: 16.06 KiB)112.001 μs (3491 allocs: 128.86 KiB)

Multidimensional Indexing

maketable() do f, A
    inds = ntuple(
        i -> begin
            axe = axes(A, i)
            collect(axe[begin:2:end])
        end,
        Val(ndims(A))
    )
    @benchmark $f($A, $inds...)
end
SizebynotbyfibymapbyfilterbyNot
(10,)290.840 ns (5 allocs: 496 bytes)107.313 ns (1 alloc: 96 bytes)119.065 ns (2 allocs: 160 bytes)178.425 ns (3 allocs: 336 bytes)1.180 μs (31 allocs: 1.22 KiB)
(10, 10)798.889 ns (9 allocs: 1.03 KiB)363.814 ns (1 alloc: 256 bytes)272.384 ns (3 allocs: 384 bytes)357.624 ns (5 allocs: 736 bytes)6.820 μs (181 allocs: 7.00 KiB)
(10, 10, 10)3.013 μs (20 allocs: 2.39 KiB)1.950 μs (8 allocs: 1.22 KiB)1.260 μs (8 allocs: 1.38 KiB)1.200 μs (11 allocs: 1.89 KiB)36.000 μs (950 allocs: 36.53 KiB)