Optional Rigorous Range Check to Prevent Potential Overflow Vulnerability in LessThan(8) Usage #83

Open
wants to merge 20 commits into base: main
52faf55
add is_safe option to check the bit-length of the input
Koukyosyumei Jan 9, 2025
d079e8e
use SemiSafeLessThan that only checks the first argument of LessThan …
Koukyosyumei Jan 10, 2025
68a79f2
fix the grammar error
Koukyosyumei Jan 10, 2025
21c4fe8
introduce 'is_safe' options for safer compairosn with LessThan
Koukyosyumei Jan 13, 2025
4ad3b62
update the template parameters of affected tests
Koukyosyumei Jan 13, 2025
5ec8cfc
introduce 'is_safe' options for safer compairosn with LessThan
Koukyosyumei Jan 13, 2025
2cf46c8
introduce the is_safe option to the generated template
Koukyosyumei Jan 13, 2025
dd6a535
fix a typo
Koukyosyumei Jan 13, 2025
ee55887
add an explanation about is_safe parameter
Koukyosyumei Jan 13, 2025
546f09f
revert the changes
Koukyosyumei Jan 13, 2025
9e00ae8
introduce is_safe option for rigorous range check
Koukyosyumei Jan 13, 2025
4e9d35f
add an explanation about is_safe option
Koukyosyumei Jan 13, 2025
8b9832d
update examples to include -i option
Koukyosyumei Jan 14, 2025
63a7dc9
revert the temporal change of test scripts
Koukyosyumei Jan 14, 2025
1525747
revert the temporal change of test scripts
Koukyosyumei Jan 14, 2025
a0fd840
revert the temporal change of test scripts
Koukyosyumei Jan 14, 2025
c3b26ad
revert the temporal change of test scripts
Koukyosyumei Jan 14, 2025
45e8a3a
improve the explanation of --is-safe opetion
Koukyosyumei Jan 20, 2025
c3afa82
move the explanation about --is-safe opetion to the Note section
Koukyosyumei Jan 20, 2025
2dbaac9
Merge branch 'main' into safe-mode-for-LessThan
Koukyosyumei Feb 27, 2025
7 changes: 5 additions & 2 deletions README.md
Expand Up @@ -60,7 +60,7 @@ yarn install
`zk-regex` is a CLI to compile a user-defined regex to the corresponding regex circuit.
It provides two commands: `raw` and `decomposed`

#### `zk-regex decomposed -d <DECOMPOSED_REGEX_PATH> -c <CIRCOM_FILE_PATH> -t <TEMPLATE_NAME> -g <GEN_SUBSTRS (true/false)>`
#### `zk-regex decomposed -d <DECOMPOSED_REGEX_PATH> -c <CIRCOM_FILE_PATH> -t <TEMPLATE_NAME> -g <GEN_SUBSTRS (true/false)> -i <IS_SAFE (true/false)>`
This command generates a regex circom from a decomposed regex definition.
For example, if you want to verify the regex of `email was meant for @(a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z)+.` and reveal alphabets after @, you can define the decomposed regex as follows.
```
Expand All @@ -86,7 +86,10 @@ You can generate its regex circom as follows.
1. Make the above json file at `./simple_regex_decomposed.json`.
2. Run `zk-regex decomposed -d ./simple_regex_decomposed.json -c ./simple_regex.circom -t SimpleRegex -g true`. It outputs a circom file at `./simple_regex.circom` that has a `SimpleRegex` template.

#### `zk-regex raw -r <RAW_REGEX> -s <SUBSTRS_JSON_PATH> -c <CIRCOM_FILE_PATH> -t <TEMPLATE_NAME> -g <GEN_SUBSTRS (true/false)>`
> [!NOTE]
> If the `-i` (`--is-safe`) option is not explicitly set to `true`, the generated Circom template performs a less rigorous range check on each character of the input string, which may inadvertently accept excessively large values. This is **not critical**, because all inputs to the regex templates are text bytes assumed to be less than 255. When `is_safe` is set to `true`, the generated Circom template adds 9 constraints per character to enforce a strict range check on each character.
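
The cost stated in the note can be estimated with a small sketch. The helper name `extra_range_check_constraints` is hypothetical and not part of `zk-regex`; the figure of 9 constraints per character is taken from the note itself.

```rust
/// Extra constraints per character when `is_safe` is true (figure from the note).
const EXTRA_CONSTRAINTS_PER_CHAR: usize = 9;

/// Hypothetical helper (not part of zk-regex): estimates how many constraints
/// the strict range check adds for an input of `msg_bytes` characters.
fn extra_range_check_constraints(msg_bytes: usize, is_safe: bool) -> usize {
    if is_safe {
        msg_bytes * EXTRA_CONSTRAINTS_PER_CHAR
    } else {
        0
    }
}
```

For a 1024-byte input, this estimate gives 9216 additional constraints with the option enabled and none with it disabled.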

#### `zk-regex raw -r <RAW_REGEX> -s <SUBSTRS_JSON_PATH> -c <CIRCOM_FILE_PATH> -t <TEMPLATE_NAME> -g <GEN_SUBSTRS (true/false)> -i <IS_SAFE (true/false)>`
This command generates a regex circom from a raw string of the regex definition and a json file that defines state transitions in DFA to be revealed.
For example, to verify the regex `1=(a|b) (2=(b|c)+ )+d` and reveal its alphabets,
1. Visualize DFA of the regex using [this website](https://zkregex.com).
20 changes: 20 additions & 0 deletions packages/circom/circuits/regex_helpers.circom
Expand Up @@ -71,4 +71,24 @@ template IsNotZeroAcc() {

signal is_zero <== IsZero()(in);
out <== acc + (1 - is_zero);
}

template SemiSafeLessThan(n) {
assert(n <= 252);
signal input in[2];
signal output out;

component aInRange = Num2Bits(n);
aInRange.in <== in[0];

// In this project, in[1] is always the constant 255, so only in[0] is range-checked.
// component bInRange = Num2Bits(n);
// bInRange.in <== in[1];

component lt = LessThan(n);

lt.in[0] <== in[0];
lt.in[1] <== in[1];

out <== lt.out;
}
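
The effect of the added `Num2Bits(n)` check can be illustrated with a toy model in Rust. This is a sketch under assumptions: the prime `P` is a small stand-in for the real scalar field, `less_than` models the usual circomlib `LessThan(n)` construction (decompose `in[0] + 2^n - in[1]` into `n + 1` bits and output one minus the top bit), and `None` stands for "the constraints cannot be satisfied". None of these Rust names exist in the project.

```rust
/// Toy prime standing in for the real scalar field (assumption for illustration only).
const P: u64 = 65_537;

/// Models Num2Bits(bits): satisfiable only if `value` fits in `bits` bits.
fn num2bits(value: u64, bits: u32) -> Option<u64> {
    if value < (1u64 << bits) {
        Some(value)
    } else {
        None
    }
}

/// Models LessThan(n): decomposes in0 + 2^n - in1 (mod P) into n + 1 bits
/// and returns 1 minus the top bit. `None` means the constraints are unsatisfiable.
fn less_than(n: u32, in0: u64, in1: u64) -> Option<u64> {
    let shifted = (in0 % P + (1u64 << n) + P - in1 % P) % P;
    let bits = num2bits(shifted, n + 1)?;
    Some(1 - ((bits >> n) & 1))
}

/// Models SemiSafeLessThan(n): range-checks in0 before comparing.
fn semi_safe_less_than(n: u32, in0: u64, in1: u64) -> Option<u64> {
    num2bits(in0 % P, n)?;
    less_than(n, in0, in1)
}
```

In this model, `less_than(8, P - 1, 255)` returns `Some(1)`: the field element `P - 1` wraps around to zero after the shift and is reported as "less than 255". `semi_safe_less_than(8, P - 1, 255)` returns `None`, because `P - 1` does not fit in 8 bits.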
16 changes: 16 additions & 0 deletions packages/compiler/src/bin/compiler.rs
Expand Up @@ -19,6 +19,7 @@
//! - `-c, --circom-file-path <PATH>`: File path for Circom output
//! - `-t, --template-name <NAME>`: Template name
//! - `-g, --gen-substrs`: Generate substrings
//! - `-i, --is-safe`: Performs rigorous checks on the range of each character in the input string, adding 9 constraints per character
//!
//! Example:
//! ```
Expand All @@ -39,11 +40,18 @@
//! - `-c, --circom-file-path <PATH>`: File path for Circom output
//! - `-t, --template-name <NAME>`: Template name
//! - `-g, --gen-substrs`: Generate substrings
//! - `-i, --is-safe`: Performs rigorous checks on the range of each character in the input string, adding 9 constraints per character
//!
//! Example:
//! ```
//! zk-regex raw -r "a*b+c?" -s substrings.json -h ./halo2_output -c ./circom_output.circom -t MyTemplate -g true
//! ```
//!
//! ## Note
//! The `-i` (`--is-safe`) option controls the rigor of range checks for input characters:
//! - If it is not set to `true`, the generated Circom template uses less rigorous range checks, which may accept excessively large values.
//! - This is usually not critical, as input text bytes are assumed to be less than 255.
//! - When `is_safe` is `true`, the template adds 9 constraints per character to enforce strict range checks.

use clap::{Parser, Subcommand};
use zk_regex_compiler::{gen_from_decomposed, gen_from_raw};
Expand All @@ -68,6 +76,8 @@ enum Commands {
template_name: Option<String>,
#[arg(short, long)]
gen_substrs: Option<bool>,
#[arg(short, long)]
is_safe: Option<bool>,
},
Raw {
#[arg(short, long)]
Expand All @@ -82,6 +92,8 @@ enum Commands {
template_name: Option<String>,
#[arg(short, long)]
gen_substrs: Option<bool>,
#[arg(short, long)]
is_safe: Option<bool>,
},
}

Expand All @@ -100,6 +112,7 @@ fn process_decomposed(cli: Cli) {
circom_file_path,
template_name,
gen_substrs,
is_safe,
} = cli.command
{
if let Err(e) = gen_from_decomposed(
Expand All @@ -108,6 +121,7 @@ fn process_decomposed(cli: Cli) {
circom_file_path.as_deref(),
template_name.as_deref(),
gen_substrs,
is_safe,
) {
eprintln!("Error: {}", e);
std::process::exit(1);
Expand All @@ -123,6 +137,7 @@ fn process_raw(cli: Cli) {
circom_file_path,
template_name,
gen_substrs,
is_safe,
} = cli.command
{
if let Err(e) = gen_from_raw(
Expand All @@ -132,6 +147,7 @@ fn process_raw(cli: Cli) {
circom_file_path.as_deref(),
template_name.as_deref(),
gen_substrs,
is_safe,
) {
eprintln!("Error: {}", e);
std::process::exit(1);
17 changes: 16 additions & 1 deletion packages/compiler/src/circom.rs
Expand Up @@ -570,6 +570,7 @@ fn generate_declarations(
and_i: usize,
multi_or_i: usize,
end_anchor: bool,
is_safe: bool,
) -> Vec<String> {
let mut declarations = vec![
"pragma circom 2.1.5;\n".to_string(),
Expand All @@ -587,7 +588,15 @@ fn generate_declarations(
"\tsignal in_range_checks[msg_bytes];".to_string(),
"\tin[0]<==255;".to_string(),
"\tfor (var i = 0; i < msg_bytes; i++) {".to_string(),
"\t\tin_range_checks[i] <== LessThan(8)([msg[i], 255]);".to_string(),
format!(
"\t\tin_range_checks[i] <== {}(8)([msg[i], 255]);",
if is_safe {
"SemiSafeLessThan"
} else {
"LessThan"
}
),
"\t\tin_range_checks[i] === 1;".to_string(),
"\t\tin[i+1] <== msg[i];".to_string(),
"\t}".to_string(),
Expand Down Expand Up @@ -734,6 +743,7 @@ fn gen_circom_allstr(
template_name: &str,
regex_str: &str,
end_anchor: bool,
is_safe: bool,
) -> Result<String, CompilerError> {
let state_len = dfa_graph.states.len();

Expand All @@ -751,6 +761,7 @@ fn gen_circom_allstr(
and_i,
multi_or_i,
end_anchor,
is_safe,
);

let init_code = generate_init_code(state_len);
Expand Down Expand Up @@ -966,12 +977,14 @@ pub(crate) fn gen_circom_template(
circom_path: &Path,
template_name: &str,
gen_substrs: bool,
is_safe: bool,
) -> Result<(), CompilerError> {
let circom = gen_circom_allstr(
&regex_and_dfa.dfa,
template_name,
&regex_and_dfa.regex_pattern,
regex_and_dfa.has_end_anchor,
is_safe,
)?;

let mut file = File::create(circom_path)?;
Expand Down Expand Up @@ -1001,12 +1014,14 @@ pub(crate) fn gen_circom_template(
pub(crate) fn gen_circom_string(
regex_and_dfa: &RegexAndDFA,
template_name: &str,
is_safe: bool,
) -> Result<String, CompilerError> {
let circom = gen_circom_allstr(
&regex_and_dfa.dfa,
template_name,
&regex_and_dfa.regex_pattern,
regex_and_dfa.has_end_anchor,
is_safe,
)?;
let substrs = add_substrs_constraints(regex_and_dfa)?;
let result = circom + &substrs;
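
The template selection made in `generate_declarations` can be isolated as a minimal sketch. The function name `range_check_line` is hypothetical; the emitted string mirrors the one built in the diff above.

```rust
/// Hypothetical helper (not part of zk-regex): builds the per-character
/// range-check line that the generated Circom template contains.
fn range_check_line(is_safe: bool) -> String {
    // With `is_safe`, the comparator that range-checks its first argument is used.
    let comparator = if is_safe { "SemiSafeLessThan" } else { "LessThan" };
    format!(
        "\t\tin_range_checks[i] <== {}(8)([msg[i], 255]);",
        comparator
    )
}
```

Only the comparator name changes between the two modes, so the rest of the generated template stays identical.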
11 changes: 11 additions & 0 deletions packages/compiler/src/lib.rs
Expand Up @@ -59,6 +59,7 @@ fn generate_outputs(
circom_template_name: Option<&str>,
num_public_parts: usize,
gen_substrs: bool,
is_safe: bool,
) -> Result<(), CompilerError> {
if let Some(halo2_dir_path) = halo2_dir_path {
let halo2_dir_path = PathBuf::from(halo2_dir_path);
Expand All @@ -85,6 +86,7 @@ fn generate_outputs(
&circom_file_path,
&circom_template_name,
gen_substrs,
is_safe,
)?;
}

Expand All @@ -110,10 +112,12 @@ pub fn gen_from_decomposed(
circom_file_path: Option<&str>,
circom_template_name: Option<&str>,
gen_substrs: Option<bool>,
is_safe: Option<bool>,
) -> Result<(), CompilerError> {
let mut decomposed_regex_config: DecomposedRegexConfig =
serde_json::from_reader(File::open(decomposed_regex_path)?)?;
let gen_substrs = gen_substrs.unwrap_or(false);
let is_safe = is_safe.unwrap_or(false);

let regex_and_dfa = get_regex_and_dfa(&mut decomposed_regex_config)?;

Expand All @@ -130,6 +134,7 @@ pub fn gen_from_decomposed(
circom_template_name,
num_public_parts,
gen_substrs,
is_safe,
)?;

Ok(())
Expand All @@ -156,13 +161,15 @@ pub fn gen_from_raw(
circom_file_path: Option<&str>,
template_name: Option<&str>,
gen_substrs: Option<bool>,
is_safe: Option<bool>,
) -> Result<(), CompilerError> {
let substrs_defs_json = load_substring_definitions_json(substrs_json_path)?;
let num_public_parts = substrs_defs_json.transitions.len();

let regex_and_dfa = create_regex_and_dfa_from_str_and_defs(raw_regex, substrs_defs_json)?;

let gen_substrs = gen_substrs.unwrap_or(true);
let is_safe = is_safe.unwrap_or(false);

generate_outputs(
&regex_and_dfa,
Expand All @@ -171,6 +178,7 @@ pub fn gen_from_raw(
template_name,
num_public_parts,
gen_substrs,
is_safe,
)?;

Ok(())
Expand All @@ -193,8 +201,10 @@ pub fn gen_circom_from_decomposed_regex(
circom_file_path: Option<&str>,
circom_template_name: Option<&str>,
gen_substrs: Option<bool>,
is_safe: Option<bool>,
) -> Result<(), CompilerError> {
let gen_substrs = gen_substrs.unwrap_or(false);
let is_safe = is_safe.unwrap_or(false);

let regex_and_dfa = get_regex_and_dfa(decomposed_regex)?;

Expand All @@ -211,6 +221,7 @@ pub fn gen_circom_from_decomposed_regex(
circom_template_name,
num_public_parts,
gen_substrs,
is_safe,
)?;

Ok(())
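
The defaulting behavior across the entry points in `lib.rs` can be summarized with a short sketch. `ResolvedFlags`, `resolve_decomposed`, and `resolve_raw` are hypothetical names used only for illustration; the default values are the ones passed to `unwrap_or` in the diff above.

```rust
/// Hypothetical container for the flags after defaults are applied.
#[derive(Debug, PartialEq)]
struct ResolvedFlags {
    gen_substrs: bool,
    is_safe: bool,
}

/// Mirrors the defaults in `gen_from_decomposed`: both flags default to false.
fn resolve_decomposed(gen_substrs: Option<bool>, is_safe: Option<bool>) -> ResolvedFlags {
    ResolvedFlags {
        gen_substrs: gen_substrs.unwrap_or(false),
        is_safe: is_safe.unwrap_or(false),
    }
}

/// Mirrors the defaults in `gen_from_raw`: `gen_substrs` defaults to true,
/// while `is_safe` still defaults to false.
fn resolve_raw(gen_substrs: Option<bool>, is_safe: Option<bool>) -> ResolvedFlags {
    ResolvedFlags {
        gen_substrs: gen_substrs.unwrap_or(true),
        is_safe: is_safe.unwrap_or(false),
    }
}
```

Because `is_safe` defaults to `false` everywhere, existing callers that pass `None` keep the previous, less strict range check unless they opt in.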
18 changes: 13 additions & 5 deletions packages/compiler/src/wasm.rs
Expand Up @@ -11,6 +11,7 @@ use self::circom::gen_circom_string;
pub fn genFromDecomposed(
decomposedRegexJson: &str,
circomTemplateName: &str,
is_safe: bool,
) -> Result<String, JsValue> {
let mut decomposed_regex_config: DecomposedRegexConfig =
serde_json::from_str(decomposedRegexJson).map_err(|e| {
Expand All @@ -24,18 +25,24 @@ pub fn genFromDecomposed(
))
})?;

gen_circom_string(&regex_and_dfa, circomTemplateName)
gen_circom_string(&regex_and_dfa, circomTemplateName, is_safe)
.map_err(|e| JsValue::from_str(&format!("Failed to generate Circom string: {}", e)))
}

#[wasm_bindgen]
#[allow(non_snake_case)]
pub fn genFromRaw(rawRegex: &str, substrsJson: &str, circomTemplateName: &str) -> String {
pub fn genFromRaw(
rawRegex: &str,
substrsJson: &str,
circomTemplateName: &str,
is_safe: bool,
) -> String {
let substrs_defs_json: SubstringDefinitionsJson =
serde_json::from_str(substrsJson).expect("failed to parse substrs json");
let regex_and_dfa = create_regex_and_dfa_from_str_and_defs(rawRegex, substrs_defs_json)
.expect("failed to convert the raw regex and state transitions to dfa");
gen_circom_string(&regex_and_dfa, circomTemplateName).expect("failed to generate circom")
gen_circom_string(&regex_and_dfa, circomTemplateName, is_safe)
.expect("failed to generate circom")
}

#[wasm_bindgen]
Expand All @@ -52,10 +59,11 @@ pub fn genRegexAndDfa(decomposedRegex: JsValue) -> JsValue {

#[wasm_bindgen]
#[allow(non_snake_case)]
pub fn genCircom(decomposedRegex: JsValue, circomTemplateName: &str) -> String {
pub fn genCircom(decomposedRegex: JsValue, circomTemplateName: &str, is_safe: bool) -> String {
let mut decomposed_regex_config: DecomposedRegexConfig =
from_value(decomposedRegex).expect("failed to parse decomposed regex");
let regex_and_dfa = get_regex_and_dfa(&mut decomposed_regex_config)
.expect("failed to convert the decomposed regex to dfa");
gen_circom_string(&regex_and_dfa, circomTemplateName).expect("failed to generate circom")
gen_circom_string(&regex_and_dfa, circomTemplateName, is_safe)
.expect("failed to generate circom")
}