Trino: Adding a New Function

Data Al Dente


Trino Version 405

How to add a new function to Trino, based on this work:

https://github.com/trinodb/trino/issues/14725

https://github.com/trinodb/trino/compare/master...nathanwilk7:trino:nathanwilk7/operator-array-histogram?diff=unified

https://github.com/trinodb/trino/pull/16024

trino requires all elements of an array to have the same type

select array_histogram(a) from (values (array[1,2,1]), (array[42,7,42,null])) t(a);
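To make the goal concrete, here is a plain-Java sketch of what the query above is meant to compute per row: a map from each distinct element to the number of times it appears. It is only a model of the semantics, not the Trino implementation; the class and method names are made up for illustration, and skipping null elements is an assumption about the intended behavior.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the array_histogram semantics; not Trino code.
final class ArrayHistogramSketch
{
    private ArrayHistogramSketch() {}

    // Counts how often each non-null element occurs, keeping first-seen order.
    static <T> Map<T, Long> histogram(List<T> elements)
    {
        Map<T, Long> counts = new LinkedHashMap<>();
        for (T element : elements) {
            if (element == null) {
                continue; // assumption: nulls are skipped rather than counted
            }
            counts.merge(element, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args)
    {
        System.out.println(histogram(Arrays.asList(1, 2, 1)));         // prints {1=2, 2=1}
        System.out.println(histogram(Arrays.asList(42, 7, 42, null))); // prints {42=2, 7=1}
    }
}
```

The generic parameter T mirrors the same-type requirement: every element of one array shares a single type, so the keys of the resulting map do too.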

what is the SPI? it stands for Service Provider Interface; the trino-spi module seems to hold the base primitives (types, blocks, block builders and so on)

scalarfunction

why is class final?

typeparameter

type

sqltype

block

// Just created a throwaway test class to experiment with blocks
// IntArrayBlock looks like the simplest one to show how things work
// args here: arrayOffset = 1, positionCount = 2, valueIsNull, values
Block block = new IntArrayBlock(1, 2, new boolean[] {false, false, true}, new int[] {2, 4, 6, 8});
block.getInt(i, 0) // i = 0 gives 4, i = 1 gives 6, i = 2 throws
block.isNull(i) // i = 0 gives false, i = 1 gives true, i = 2 throws
block.fixedSizeInBytesPerPosition() // 5 (4 bytes for the int, 1 byte for the null flag)
block.getSizeInBytes() // 10
block.getRegionSizeInBytes(100, 999) // 4995, ignores the actual data
block.getPositionsSizeInBytes(null, 999) // 4995, ignores the first param
block.getRetainedSizeInBytes() // 91, does some fancy Java magic to get the size
block.getEstimatedDataSizeForStats(i) // some kind of logical data size (null is 0, ints are 4)
retainedBytesForEachPart // skipped for now
getPositionCount
getInt
mayHaveNull // just a shallow check of whether the nulls array is non-null
isNull // just checks the position in the nulls array (if it's non-null)
getSingleValueBlock // creates a new IntArrayBlock with just this position copied into it
copyPositions // copies the given array of positions into a new IntArrayBlock
getRegion // returns length positions starting at positionOffset, without copying the underlying arrays
copyRegion // same as getRegion but actually copies the underlying arrays
copyWithAppendedNull // copies the whole thing but puts a null on the end
getValuesSlice // not an override, gives direct access to the values
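The numbers in the notes above are easier to follow with a toy model. The sketch below is a hypothetical stand-in (ToyIntBlock is not a Trino class) that assumes a block is just a window of positionCount entries, starting at arrayOffset, over two parallel arrays. It reproduces the values seen above and shows the difference between a region that shares the arrays and one that copies them.

```java
import java.util.Arrays;

// Hypothetical toy model of an int block; not Trino's IntArrayBlock.
final class ToyIntBlock
{
    private final int arrayOffset;
    private final int positionCount;
    private final boolean[] valueIsNull;
    private final int[] values;

    ToyIntBlock(int arrayOffset, int positionCount, boolean[] valueIsNull, int[] values)
    {
        if (arrayOffset < 0 || positionCount < 0
                || values.length - arrayOffset < positionCount
                || valueIsNull.length - arrayOffset < positionCount) {
            throw new IllegalArgumentException("window does not fit the arrays");
        }
        this.arrayOffset = arrayOffset;
        this.positionCount = positionCount;
        this.valueIsNull = valueIsNull;
        this.values = values;
    }

    int getPositionCount() { return positionCount; }

    // Reads are relative to the window, not to the backing array.
    int getInt(int position) { check(position); return values[arrayOffset + position]; }

    boolean isNull(int position) { check(position); return valueIsNull[arrayOffset + position]; }

    // 4 bytes per int plus 1 byte per null flag, for visible positions only.
    long getSizeInBytes() { return (long) (Integer.BYTES + Byte.BYTES) * positionCount; }

    // A narrower window over the same arrays; nothing is copied.
    ToyIntBlock getRegion(int positionOffset, int length)
    {
        checkRegion(positionOffset, length);
        return new ToyIntBlock(arrayOffset + positionOffset, length, valueIsNull, values);
    }

    // Same window, but backed by fresh arrays holding only that range.
    ToyIntBlock copyRegion(int positionOffset, int length)
    {
        checkRegion(positionOffset, length);
        int from = arrayOffset + positionOffset;
        return new ToyIntBlock(0, length,
                Arrays.copyOfRange(valueIsNull, from, from + length),
                Arrays.copyOfRange(values, from, from + length));
    }

    private void check(int position)
    {
        if (position < 0 || position >= positionCount) {
            throw new IndexOutOfBoundsException("position " + position);
        }
    }

    private void checkRegion(int positionOffset, int length)
    {
        if (positionOffset < 0 || length < 0 || positionOffset + length > positionCount) {
            throw new IndexOutOfBoundsException("invalid region");
        }
    }

    public static void main(String[] args)
    {
        int[] values = {2, 4, 6, 8};
        ToyIntBlock block = new ToyIntBlock(1, 2, new boolean[] {false, false, true}, values);
        System.out.println(block.getInt(0) + " " + block.getInt(1)); // prints 4 6
        System.out.println(block.getSizeInBytes());                  // prints 10

        ToyIntBlock view = block.getRegion(1, 1);
        ToyIntBlock copy = block.copyRegion(1, 1);
        values[2] = 60; // mutating the backing array only to expose the sharing
        System.out.println(view.getInt(0) + " " + copy.getInt(0));   // prints 60 6
    }
}
```

Mutating the backing array is done here purely to make the sharing visible; real blocks are treated as immutable, which is what makes handing out a shared view safe.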

note: creating a block in a test and then stepping through it in the debugger is a nice way to quickly try out a bunch of its methods

blocks are immutable once created

it’s confusing to me that positionCount can be 0 even if the underlying array is non-empty; the block only lets you access positionCount positions, no matter how long the array is

i find the block interface very strange since it has a bunch of default methods that just throw unsupported; why not have more fine-grained interfaces for things like byte, int, etc?

why can you still call getInt on a position that is null?
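One plausible reading, sketched below in plain Java: the values and the null flags live in two parallel arrays, and reading an int just indexes the values array without consulting the flags. Under that assumption a "null" position still has some int stored in its slot, and it is the caller's job to check the flag first. The class and method names are hypothetical, not Trino APIs.

```java
// Hypothetical sketch of the parallel-array layout; not Trino code.
final class NullableIntsSketch
{
    private NullableIntsSketch() {}

    // Sums only the positions whose null flag is false.
    static long sumNonNull(int[] values, boolean[] valueIsNull)
    {
        long sum = 0;
        for (int position = 0; position < values.length; position++) {
            if (!valueIsNull[position]) {
                sum += values[position];
            }
        }
        return sum;
    }

    public static void main(String[] args)
    {
        int[] values = {4, 6};
        boolean[] valueIsNull = {false, true};

        // The slot behind a null position is still readable; it is just not meaningful.
        System.out.println(values[1]);                       // prints 6
        System.out.println(sumNonNull(values, valueIsNull)); // prints 4
    }
}
```

Skipping the flag check inside the read keeps the hot path to a single array access, at the cost of trusting every caller to ask isNull first.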

Here are the input block classes for the array_histogram function

blockbuilder

slice

operatordependency

arr specific

map specific

hist specific

testing

docs