Skip to content

TreeRewriter optimizations and default diagnostics engine #721

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
78 changes: 53 additions & 25 deletions lib/parser/source/tree_rewriter.rb
Original file line number Diff line number Diff line change
Expand Up @@ -83,14 +83,10 @@ module Source
# @!attribute [r] source_buffer
# @return [Source::Buffer]
#
# @!attribute [r] diagnostics
# @return [Diagnostic::Engine]
#
# @api public
#
class TreeRewriter
attr_reader :source_buffer
attr_reader :diagnostics

##
# @param [Source::Buffer] source_buffer
Expand All @@ -99,22 +95,18 @@ def initialize(source_buffer,
crossing_deletions: :accept,
different_replacements: :accept,
swallowed_insertions: :accept)
@diagnostics = Diagnostic::Engine.new
@diagnostics.consumer = -> diag { $stderr.puts diag.render }

@diagnostics = nil
@source_buffer = source_buffer
@in_transaction = false

@policy = {crossing_deletions: crossing_deletions,
different_replacements: different_replacements,
swallowed_insertions: swallowed_insertions}.freeze
check_policy_validity
@crossing_deletions = check_policy_value(crossing_deletions)
@different_replacements = check_policy_value(different_replacements)
@swallowed_insertions = check_policy_value(swallowed_insertions)

@enforcer = method(:enforce_policy)
# We need a range that would be jugded as containing all other ranges,
# including 0...0 and size...size:
all_encompassing_range = @source_buffer.source_range.adjust(begin_pos: -1, end_pos: +1)
@action_root = TreeRewriter::Action.new(all_encompassing_range, @enforcer)
@action_root = TreeRewriter::Action.new(all_encompassing_range, self)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing self as an argument is usually a sign of a bad composition (just like passing a private instance method 😄 )

Also, the argument on the TreeRewriter::Action is not an enforcer anymore, it's a tree_rewriter, and so it introduces a bi-directional dependency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very legitimate comment. I didn't document it, but I thought that asking for an object responding to call, or an object responding to enforce_policy was pretty similar. That's also the idea in not renaming it; it's an enforcer and needs responding to a single call enforce_policy... It just so happens that TreeRewriter responds to enforce_policy ;-)

Do you have a suggestion?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like both original/new implementations from this POV. Bi-directional relationships between objects cause troubles from time to time, so I'd try to avoid it.

Previously only a single private method was shared with Action, and so it was a smaller violation of encapsulation (but it still was a violation). I'd personally keep the original version only for that reason.

At the same time your implementation looks better to me in terms of types and ifaces. Both implementation have issues, the question is which one has more downsides and I don't see "the only" answer here. Up to you to decide.

end

##
Expand Down Expand Up @@ -326,6 +318,24 @@ def transaction
@in_transaction = previous
end

##
# Provides access to a diagnostic engine.
# By default outputs diagnostic to $stderr
#
def self.default_diagnostics
@default_diagnostics ||= Diagnostic::Engine.new.tap do |engine|
engine.consumer = -> diag { $stderr.puts diag.render }
end
end

##
# Provides access to a diagnostic engine.
# By default: self.class.default_diagnostics
#
def diagnostics
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it still could be set in the constructor. Having initialization in a single place (like it was before) seems to be more readable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, it would be in the constructor if it didn't need to be dup'ed, but it does...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but even if you need .dup you still can do it in the constructor. Initialization of this field doesn't have to be lazy

@diagnostics ||= self.class.default_diagnostics.dup
end

def in_transaction?
@in_transaction
end
Expand Down Expand Up @@ -355,21 +365,32 @@ def insert_after_multi(range, text)

extend Deprecation

##
# @api private
# reserved for TreeAction
#
def enforce_policy(event)
return if policy(event) == :accept
return unless (values = yield)
trigger_policy(event, **values)
end

protected

attr_reader :action_root

private

ACTIONS = %i[accept warn raise].freeze
def check_policy_validity
invalid = @policy.values - ACTIONS
raise ArgumentError, "Invalid policy: #{invalid.join(', ')}" unless invalid.empty?
ACTIONS = %i[accept warn raise].to_set.freeze
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it actually better to use sets for 3 symbols? IIRC sets in ruby are hashes, and small hashes are arrays. Do you get anything from this to_set? I thought it makes it slower because computing .hash and comparing it is slower than comparing symbols that are numbers internally. Not a blocker at all

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good point. I'm not sure I can explain why, but even for a three element set, a non matching include check is faster on a Set than on an Array, so is a matching check on the 2nd or 3rd element. Only a matching check on the first element seems faster on an Array.
Beyond performance, I feel that an object to call include? on should be a Set, and Set should optimize that call...

def check_policy_value(value)
raise ArgumentError, "Invalid policy value: #{value}" unless ACTIONS.include?(value)

value
end

def combine(range, attributes)
range = check_range_validity(range)
action = TreeRewriter::Action.new(range, @enforcer, **attributes)
action = TreeRewriter::Action.new(range, self, **attributes)
@action_root = @action_root.combine(action)
self
end
Expand All @@ -381,21 +402,28 @@ def check_range_validity(range)
range
end

def enforce_policy(event)
return if @policy[event] == :accept
return unless (values = yield)
trigger_policy(event, **values)
EVENT_TO_POLICY = {
crossing_deletions: :@crossing_deletions,
different_replacements: :@different_replacements,
swallowed_insertions: :@swallowed_insertions,
}.freeze

def policy(event)
return :raise if event == :crossing_insertions

instance_variable_get(EVENT_TO_POLICY.fetch(event))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's rewrite it to case. The difference in terms of speed it too small but it makes it harder to follow. Also, this way you can combine it with return :raise above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup. You're absolutely right.

end

POLICY_TO_LEVEL = {warn: :warning, raise: :error}.freeze
def trigger_policy(event, range: raise, conflict: nil, **arguments)
action = @policy[event] || :raise
action = policy(event)
diag = Parser::Diagnostic.new(POLICY_TO_LEVEL[action], event, arguments, range)
@diagnostics.process(diag)
engine = @diagnostics || self.class.default_diagnostics
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be just diagnostic (or @diagnostic if you move initialization back to constructor)

engine.process(diag)
if conflict
range, *highlights = conflict
diag = Parser::Diagnostic.new(POLICY_TO_LEVEL[action], :"#{event}_conflict", arguments, range, highlights)
@diagnostics.process(diag)
engine.process(diag)
end
raise Parser::ClobberingError, "Parser::Source::TreeRewriter detected clobbering" if action == :raise
end
Expand Down
6 changes: 3 additions & 3 deletions lib/parser/source/tree_rewriter/action.rb
Original file line number Diff line number Diff line change
Expand Up @@ -206,7 +206,7 @@ def check_fusible(action, *fusible)
return if fusible.empty?
fusible.each do |child|
kind = action.insertion? || child.insertion? ? :crossing_insertions : :crossing_deletions
@enforcer.call(kind) { {range: action.range, conflict: child.range} }
@enforcer.enforce_policy(kind) { {range: action.range, conflict: child.range} }
end
fusible
end
Expand All @@ -222,15 +222,15 @@ def merge(action)
end

def call_enforcer_for_merge(action)
@enforcer.call(:different_replacements) do
@enforcer.enforce_policy(:different_replacements) do
if @replacement && action.replacement && @replacement != action.replacement
{range: @range, replacement: action.replacement, other_replacement: @replacement}
end
end
end

def swallow(children)
@enforcer.call(:swallowed_insertions) do
@enforcer.enforce_policy(:swallowed_insertions) do
insertions = children.select(&:insertion?)

{range: @range, conflict: insertions.map(&:range)} unless insertions.empty?
Expand Down
13 changes: 13 additions & 0 deletions test/test_source_tree_rewriter.rb
Original file line number Diff line number Diff line change
Expand Up @@ -298,6 +298,19 @@ def test_ordered_replacements
result.map {|r, s| [r.to_range, s]}
)
end

def test_default_diagnostics
default = Parser::Source::TreeRewriter.default_diagnostics
before = default.consumer
diagnostics = []
default.consumer = -> diag { diagnostics << diag.render }
rewriter = Parser::Source::TreeRewriter.new(@buf, swallowed_insertions: :warn)
rewriter.replace(@ll, 'xx')
rewriter.remove(@hello)
assert_equal(2, diagnostics.size)
ensure
default.consumer = before
end
end

class TestSourceTreeRewriterImport < Minitest::Test
Expand Down