Commit 5ad621f
Fix LIKE with escapes (#6703)
* Fix LIKE with escapes
Fix LIKE processing for patterns containing escapes
- the starts_with / ends_with optimization did not correctly check for
escapes when checking whether the rest of the pattern is a literal
- the pattern-to-regexp compiler incorrectly processed \ followed by a
character other than % or _. In PostgreSQL, the pattern '\x' matches a
single 'x'.
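For illustration, the two fixes above can be sketched in Python. This is not the arrow-string code. The function names (`like_to_regex`, `like`, `like_with_fast_path`) are hypothetical, and treating a trailing lone backslash as a literal backslash is an assumption of this sketch (PostgreSQL may reject such a pattern instead).

```python
import re


def like_to_regex(pattern: str, escape: str = "\\") -> re.Pattern:
    """Translate a LIKE pattern to a regex (hypothetical helper)."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape:
            if i + 1 < len(pattern):
                # An escape followed by ANY character matches that character
                # literally, so '\x' matches a single 'x'.
                parts.append(re.escape(pattern[i + 1]))
                i += 2
                continue
            # Assumption of this sketch: a dangling escape is a literal backslash.
            parts.append(re.escape(ch))
        elif ch == "%":
            parts.append(".*")  # any sequence of characters
        elif ch == "_":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    # DOTALL lets % and _ match newlines as well.
    return re.compile("".join(parts), re.DOTALL)


def like(value: str, pattern: str) -> bool:
    """Return True if the whole value matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None


def like_with_fast_path(value: str, pattern: str) -> bool:
    """Use startswith for 'literal%' patterns, otherwise fall back to the regex."""
    if pattern.endswith("%"):
        prefix = pattern[:-1]
        # The prefix is a literal only if it contains no wildcard AND no escape.
        if not any(c in prefix for c in "%_\\"):
            return value.startswith(prefix)
    return like(value, pattern)


print(like("x", r"\x"))                   # escape followed by an ordinary character
print(like("abc", r"a\_c"))               # escaped underscore is not a wildcard
print(like_with_fast_path("a%", r"a\%"))  # escaped % must not take the fast path
```

The fast path only fires when the prefix is free of `%`, `_`, and the escape character, which is the condition the first bullet says was previously checked incorrectly.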
There are two tests:
- like_escape_many was generated using PostgreSQL, with the code attached
below for verification
- like_escape consists of hand-picked test cases that are more interesting.
The lower cardinality of the hand-picked test cases allows for exercising all
scalar/array vs scalar/array combinations.
The script below isn't the simplest possible, because it was also an attempt
to generate more test cases by adding padding; hence, e.g.,
is_like_without_dangling_escape. Since it is attached for reference, it is
attached as-is.
```python
import psycopg2
data = r"""
\
\\
\\\
\\\\
a
\a
\\a
%
\%
\\%
%%
\%%
\\%%
_
\_
\\_
__
\__
\\__
abc
a_c
a\bc
a\_c
%abc
\%abc
a\\_c%
""".split('\n')
data = list(dict.fromkeys(data))
conn = psycopg2.connect(host='localhost', port=5432, user='postgres', password='mysecretpassword')
conn.set_session(autocommit=True)
cursor = conn.cursor()
for r in data:
    try:
        # PostgreSQL verifies dangling escape only sometimes
        cursor.execute("SELECT %s LIKE %s", (r, r))
        is_like, = cursor.fetchone()
        has_dangling_escape = False
        pg_pattern = r
    except Exception as e:
        if 'LIKE pattern must not end with escape character' not in str(e):
            raise e
        has_dangling_escape = True
        # Escape the trailing backslash so the pattern becomes valid
        pg_pattern = r + '\\'
    for l in data:
        # print()
        # print(' '.join(str(v) for v in (l, r, has_dangling_escape, pg_pattern)))
        cursor.execute("SELECT %s LIKE %s", (l, pg_pattern))
        is_like, = cursor.fetchone()
        assert type(is_like) is bool
        if not is_like and has_dangling_escape:
            # Drop the escaped backslash (two characters) from the pattern
            pattern_without_escaped_dangling_escape = pg_pattern[:-2]
            cursor.execute("SELECT %s LIKE %s", (l, pattern_without_escaped_dangling_escape))
            is_like_without_dangling_escape, = cursor.fetchone()
            assert type(is_like_without_dangling_escape) is bool
        else:
            is_like_without_dangling_escape = False
        assert '"' not in l
        assert '"' not in r
        # Emit one test-case tuple: (value, pattern, expected)
        print('(r"%s", r"%s", %s),' % (
            l, r,
            str(is_like).lower(),
            # str(has_dangling_escape).lower(),
            # str(is_like_without_dangling_escape).lower(),
        ))
```
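The dangling-escape handling in the script is plain string manipulation, so it can be checked without a database. A minimal sketch, where the helper name `escape_dangling` is hypothetical and not part of the script:

```python
def escape_dangling(pattern: str) -> str:
    """Append a backslash so a trailing lone escape becomes an escaped backslash."""
    return pattern + "\\"


r = "a\\"                         # the pattern a\ , which ends with a dangling escape
pg_pattern = escape_dangling(r)   # a\\ : 'a' followed by an escaped (literal) backslash
without_escape = pg_pattern[:-2]  # a : the pattern with the escaped backslash removed

print(pg_pattern)
print(without_escape)
```

Slicing with `[:-2]` removes both the original dangling backslash and the one appended to escape it.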
* Compact tests for regex_like
Reduce test code boilerplate and make it easier to see what the test
cases are.
* Add more test cases for regex_like
1 parent 6dea453 commit 5ad621f
2 files changed, +1094 -59 lines changed
- arrow-string/src