Skip to content

array_has function returns null for an empty list ([]) instead of false #16474

Open
@bert-beyondloops

Description

@bert-beyondloops

Describe the bug

When using the array_has function with an empty list, the result is

Currently :

array_has([], 1) => null.
array_has(null, 1) => null.

In my opinion, it should only return null when the list itself is null.

In duckDb:
array_has([], 1) => false.
array_has(null, 1) => null.

To Reproduce

CREATE TABLE foo ( bar INTEGER[] );

INSERT INTO foo VALUES 
( [1, 3, 5]),
(  []),
( null)
;

SELECT bar, IFNULL(CAST(list_contains(bar, 1) AS VARCHAR), 'null') FROM foo;

will result in :


+-----------+-----------------------------------------------+
| bar       | nvl(array_has(foo.bar,Int64(1)),Utf8("null")) |
+-----------+-----------------------------------------------+
| [1, 3, 5] | true                                          |
| []        | null                                          |
|           | null                                          |
+-----------+-----------------------------------------------+

whereas for example in duckDb you get :


┌───────────┬──────────────────────────────────────────────────────────┐
│    bar    │ COALESCE(CAST(list_contains(bar, 1) AS VARCHAR), 'null') │
│  int32[]  │                         varchar                          │
├───────────┼──────────────────────────────────────────────────────────┤
│ [1, 3, 5] │ true                                                     │
│ []        │ false                                                    │
│           │ null                                                     │
└───────────┴──────────────────────────────────────────────────────────┘

Expected behavior

array_has([], 1) => false.
array_has(null, 1) => null.

Additional context

According to me, the issue is situated in the array_has_dispatch_for_scalar method :

...
for (i, offset) in offsets.windows(2).enumerate() {
        let start = offset[0].to_usize().unwrap();
        let end = offset[1].to_usize().unwrap();
        let length = end - start;
        // For non-nested list, length is 0 for null
        if length == 0 {
            continue;
        }
        let sliced_array = eq_array.slice(start, length);
        final_contained[i] = Some(sliced_array.true_count() > 0);
    }
...

There is no distinction between null or length = 0.
You cannot only look at the offset difference. You have to look at the validity buffer as well.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions