Make `CommonSubexprEliminate` faster by no longer copying so many strings #10426

### Is your feature request related to a problem or challenge?

Part of https://github.com/apache/datafusion/issues/5637

One of the optimizer passes is "common subexpression elimination" (CSE), which removes redundant computation.

However, as @peter-toth noted on https://github.com/apache/datafusion/pull/10396, and as the CSE code itself says:

https://github.com/apache/datafusion/blob/d58bae487329b7a7078429f083bffc611f42c8c7/datafusion/optimizer/src/common_subexpr_eliminate.rs#L108-L119


The pass tracks common subexpressions via string manipulation, which is non-ideal for several reasons (including the cost of creating those strings).
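To make the string cost concrete, here is a minimal sketch, not DataFusion code. `ToyExpr`, `count_with_string_ids`, and `left_deep` are hypothetical names invented for illustration; the only thing taken from the real code is the idea of keying a map by the `Display` string of each subexpression.

```rust
use std::collections::HashMap;
use std::fmt;

// Toy expression tree (an assumption for illustration, NOT DataFusion's `Expr`).
enum ToyExpr {
    Column(String),
    Add(Box<ToyExpr>, Box<ToyExpr>),
}

impl fmt::Display for ToyExpr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ToyExpr::Column(name) => write!(f, "{name}"),
            ToyExpr::Add(l, r) => write!(f, "({l} + {r})"),
        }
    }
}

/// Visits every node bottom-up, uses the node's `Display` string as its
/// identifier, and returns the total number of bytes of identifier strings built.
fn count_with_string_ids(expr: &ToyExpr, counts: &mut HashMap<String, usize>) -> usize {
    let mut bytes = 0;
    if let ToyExpr::Add(l, r) = expr {
        bytes += count_with_string_ids(l, counts);
        bytes += count_with_string_ids(r, counts);
    }
    // Each node re-renders its whole subtree into a fresh `String`.
    let id = expr.to_string();
    bytes += id.len();
    *counts.entry(id).or_insert(0) += 1;
    bytes
}

/// Builds a left-deep chain `((a + a) + a) ...` containing `depth` additions.
fn left_deep(depth: usize) -> ToyExpr {
    let mut e = ToyExpr::Column("a".to_string());
    for _ in 0..depth {
        e = ToyExpr::Add(Box::new(e), Box::new(ToyExpr::Column("a".to_string())));
    }
    e
}

fn main() {
    for depth in [10, 100, 1000] {
        let mut counts = HashMap::new();
        let bytes = count_with_string_ids(&left_deep(depth), &mut counts);
        println!("depth {depth}: {bytes} bytes of identifier strings");
    }
}
```

In this toy model the identifier of a node is as long as its whole subtree, so the total bytes produced grow quadratically with the depth of a left-deep expression, even though the tree itself only grows linearly.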

### Describe the solution you'd like


Revisit the identifiers: using these string identifiers as the keys of `ExprStats` was not the best choice. Please note that this is how CSE has worked since the feature was first added.
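One possible direction, sketched under assumptions rather than proposed as the final design: make the identifier a small `Copy` value holding a precomputed hash plus a borrowed reference to the node. `ToyExpr`, `NodeId`, `visit`, `col`, and `add` below are hypothetical names for illustration and do not exist in DataFusion.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Toy expression tree (an assumption for illustration, NOT DataFusion's `Expr`).
#[derive(Debug, PartialEq, Eq)]
enum ToyExpr {
    Column(String),
    Add(Box<ToyExpr>, Box<ToyExpr>),
}

/// Hypothetical identifier: a precomputed hash plus a borrowed node.
/// It is `Copy`, so passing it around never allocates.
#[derive(Debug, Clone, Copy)]
struct NodeId<'a> {
    hash: u64,
    expr: &'a ToyExpr,
}

impl PartialEq for NodeId<'_> {
    fn eq(&self, other: &Self) -> bool {
        // Cheap hash check first; the structural comparison guards against collisions.
        self.hash == other.hash && self.expr == other.expr
    }
}
impl Eq for NodeId<'_> {}

impl Hash for NodeId<'_> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Reuse the precomputed hash instead of walking the subtree again.
        state.write_u64(self.hash);
    }
}

/// Bottom-up traversal: a parent's hash is accumulated from its children's
/// hashes, so each node is hashed once. Returns the hash of `expr`.
fn visit<'a>(expr: &'a ToyExpr, counts: &mut HashMap<NodeId<'a>, usize>) -> u64 {
    let mut hasher = DefaultHasher::new();
    match expr {
        ToyExpr::Column(name) => {
            0u8.hash(&mut hasher); // variant tag
            name.hash(&mut hasher);
        }
        ToyExpr::Add(l, r) => {
            1u8.hash(&mut hasher); // variant tag
            visit(l, counts).hash(&mut hasher);
            visit(r, counts).hash(&mut hasher);
        }
    }
    let hash = hasher.finish();
    *counts.entry(NodeId { hash, expr }).or_insert(0) += 1;
    hash
}

// Small constructors to keep the example readable.
fn col(name: &str) -> ToyExpr {
    ToyExpr::Column(name.to_string())
}
fn add(l: ToyExpr, r: ToyExpr) -> ToyExpr {
    ToyExpr::Add(Box::new(l), Box::new(r))
}

fn main() {
    // (a + b) + (a + b): `a`, `b`, and `(a + b)` each occur twice.
    let expr = add(add(col("a"), col("b")), add(col("a"), col("b")));
    let mut counts = HashMap::new();
    visit(&expr, &mut counts);
    let repeated = counts.values().filter(|&&n| n > 1).count();
    println!("{repeated} distinct subexpressions occur more than once");
}
```

This sketch still performs a structural comparison whenever two hashes match, which keeps collisions from producing false matches. Whether that trade-off, or the borrowed lifetime, fits the real optimizer is an open question for the actual implementation.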

### Describe alternatives you've considered

_No response_

### Additional context

_No response_

Referenced lines from `common_subexpr_eliminate.rs` (the permalink above):

```rust
/// Identifier for each subexpression.
///
/// Note that the current implementation uses the `Display` of an expression
/// (a `String`) as `Identifier`.
///
/// An identifier should (ideally) be able to "hash", "accumulate", "equal" and "have no
/// collision (as low as possible)"
///
/// Since an identifier is likely to be copied many times, it is better that an identifier
/// is small or "copy". otherwise some kinds of reference count is needed. String description
/// here is not such a good choose.
type Identifier = String;
```
