Proposal: Comptime Labels (tags)
This proposal started as a more generic, universal and concrete version of the "distinct types" proposal (#1595), but it is also related to the "tags" proposal (#1099), since the idea here is essentially about adding comptime "metadata" to code.
The main benefit of the proposal is added type safety for primitive values like floats, ints and booleans.
A secondary benefit is that by making the labeling syntax generic and applicable "everywhere", it becomes both simpler and more versatile, possibly serving as a platform for implementing other comptime-check features mostly in userland. After all, features with very restricted applicability can sometimes be more confusing than generic and consistent ones.
The syntax change of this proposal introduces only the [: token, and otherwise looks much like array indexing. Short preview of syntax:
a[:label1]; // attach "label1" to variable "a"
b : f64[:label2]; // expect variable "b" to be of type f64 with label "label2" attached;

fn useLabeledBooleans(reset: bool[:reset], dstart: bool[:delayedStart], dstop: bool[:delayedStop]) void {
    //...
}
useLabeledBooleans(true[:reset], false[:delayedStart], false[:delayedStop]);
//using wrong label would be caught during compilation
The syntax is verbose, but in turn it hides nothing of the annotated/labeled code from the reader.
More examples in this gist.
Premises
Two premises of this proposal are:
Labels are there ONLY to make the compiler do additional sanity checks on code.
If you remove labels from your source, the EXACT same runtime assembly code is generated.
These premises limit some of the imagined uses of labels, but in return they make labels very easy to grasp mentally, which might be a good trade-off. They also mean that comptime labels should not impact runtime performance at all.
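The "no runtime cost" premise can be approximated in today's Zig, which has no `[:` syntax. The sketch below is only a stand-in for the proposed feature: `Flag` and `Labeled` are hypothetical names, and the label is carried as a comptime parameter of a wrapper struct. Equal sizes are evidence for the premise, not proof that the generated assembly is identical.

```zig
const std = @import("std");

// Hypothetical label set standing in for the labelInstances of the preview.
const Flag = enum { reset, delayedStart, delayedStop };

// Hypothetical stand-in for `T[:label]`: the label is a comptime parameter,
// so it is part of the type and takes no space in the value.
fn Labeled(comptime T: type, comptime label_arg: Flag) type {
    return struct {
        value: T,
        // Referencing the parameter keeps each label a distinct type.
        pub const label = label_arg;
    };
}

fn useLabeledBooleans(
    reset: Labeled(bool, .reset),
    dstart: Labeled(bool, .delayedStart),
    dstop: Labeled(bool, .delayedStop),
) bool {
    // Report whether any of the three flags is set.
    return reset.value or dstart.value or dstop.value;
}

test "labels cost nothing at run time" {
    // The wrapper has the same size as the bare primitive.
    try std.testing.expect(@sizeOf(Labeled(bool, .reset)) == @sizeOf(bool));

    const r = Labeled(bool, .reset){ .value = true };
    const a = Labeled(bool, .delayedStart){ .value = false };
    const o = Labeled(bool, .delayedStop){ .value = false };
    try std.testing.expect(useLabeledBooleans(r, a, o));
    // useLabeledBooleans(o, a, r) would not compile: the labels are swapped.
}
```

Unlike the proposed syntax, this emulation needs explicit `.value` accesses, which is part of the verbosity the proposal tries to avoid.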
This would also open up the possibility of implementing labels in stage2 only. If the stage1 compiler sees a label, it can give an error and request the "don't process labels" compiler flag to be set.
Not sticking to these premises could also be considered if there were any benefits in that, e.g. if a sufficient number of keywords could be made redundant with labels.
Compiler utilization of comptime labels
With labels, the compiler can ask additional questions about code:
Given an operator, the operands and their labels ..determine if this expression is valid as per the labels. E.g. the compiler can enforce that it's not possible to add meters to seconds, even if both are represented as f64. Also, if bitwise OR on apples doesn't make sense, disallowing bitwise OR on integers representing apples is possible with labels.
Given a function call and the arguments ..check whether the arguments are labeled in accordance with the function definition. With labels, you cannot pass an integer representing a bitflag to a function expecting an integer that represents a database ID.
Given an instance with labels for the different states it can be in ..check whether the state of the instance is appropriate for a given function call. This basically means the programmer is annotating state "manually" with labels, but the compiler can perform sanity checks using labels, whereas this is not possible with plain old code comments.
In short ..for every assignment, operator expression and function call, check label compatibility (e.g. between operands) and whether the labels impose any restrictions.
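The first two checks can be imitated with ordinary comptime code. This is a rough sketch under assumptions, not the proposed mechanism: `Unit`, `Labeled`, `add` and `sleepFor` are hypothetical names, and the operator check is written as a function because current Zig has no operator overloading.

```zig
const std = @import("std");

// Hypothetical labels for two kinds of f64 quantity.
const Unit = enum { meters, seconds };

fn Labeled(comptime T: type, comptime label_arg: Unit) type {
    return struct {
        value: T,
        pub const label = label_arg;
    };
}

// Operator check: both operands must carry the same label.
fn add(a: anytype, b: anytype) @TypeOf(a) {
    if (@TypeOf(a).label != @TypeOf(b).label) {
        @compileError("cannot add values with different labels");
    }
    return .{ .value = a.value + b.value };
}

// Function-call check: the parameter type names the required label.
fn sleepFor(duration: Labeled(f64, .seconds)) f64 {
    return duration.value;
}

test "label checks on operators and calls" {
    const a = Labeled(f64, .meters){ .value = 1.5 };
    const b = Labeled(f64, .meters){ .value = 2.0 };
    try std.testing.expectEqual(@as(f64, 3.5), add(a, b).value);

    const t = Labeled(f64, .seconds){ .value = 0.25 };
    try std.testing.expectEqual(@as(f64, 0.25), sleepFor(t));
    // add(a, t) and sleepFor(a) are both rejected at compile time.
}
```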
Implementation from a user perspective
With this proposal, new syntax must be introduced for applying the comptime labels within code, but configuring and creating custom labels requires no new syntax at all.
LabelTemplate ..built-in (standardized) interface, or convention on struct definitions, that the compiler knows how to translate into comptime checks. (Implements the callbacks described in the "check labels" outline.) There can be multiple LabelTemplates for different use-cases, some simple and user friendly, some more advanced and powerful.
LabelGroup ..struct def that follows/implements a LabelTemplate. These can be user defined or reside in a library.
labelInstance ..user or library defined instance of a LabelGroup struct. Simply a normal struct instance just like any other. Simplest form of a labelInstance is just a wrapped enum value, but any metadata is possible.
a[:labelInstance] ..read as, variable "a" is labeled with "labelInstance", instance of a LabelGroup (that conforms to a LabelTemplate).
a : f64[:labelInstance] ..read as, expect variable "a" to be of type f64 but also labeled with "labelInstance".
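To make the three terms concrete, here is a minimal sketch in which a LabelGroup is an ordinary struct and labelInstances are ordinary comptime-known values of it. `Currency`, its fields and the `Labeled` wrapper are all made up for illustration; the wrapper again stands in for the `T[:labelInstance]` syntax that Zig does not have.

```zig
const std = @import("std");

// Hypothetical LabelGroup: a plain struct whose instances serve as labels.
const Currency = struct {
    code: Code,
    minor_digits: u8, // extra metadata beyond the wrapped enum value

    pub const Code = enum { usd, jpy };
};

// labelInstances: normal struct instances, just like any other.
const usd = Currency{ .code = .usd, .minor_digits = 2 };
const jpy = Currency{ .code = .jpy, .minor_digits = 0 };

// Stand-in for `T[:labelInstance]`, parameterized by a label instance.
fn Labeled(comptime T: type, comptime label_arg: Currency) type {
    return struct {
        value: T,
        pub const label = label_arg;
    };
}

test "label instances distinguish otherwise identical types" {
    const Dollars = Labeled(i64, usd);
    const Yen = Labeled(i64, jpy);
    try std.testing.expect(Dollars != Yen);
    // The metadata stays available at comptime.
    try std.testing.expect(Dollars.label.minor_digits == 2);
}
```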
Outline of compiler "check labels" procedure
verify syntax, no "label brackets" ( [: tokens) in places where they don't belong
verify that all identifiers inside label brackets are valid labels. (instances comply to some predefined template)
verify that no labels are applied to a type they do not accept. (labels must directly or indirectly through a Label Template define a fn typeIsOk(comptime T : type) callback)
then, for all operations [note 1]:
verify that all operand associated labels are of the same group (instances of same struct)
unlabeled operands are treated as if they belong to some fictitious "unlabeled" group
if one of the operands has the "infer me" label, it'll receive the same label as the other operand has
raise error if there's a label group conflict
verify with the "label group" that this combination of label instances in the given operation is allowed, and find out what label the operation result should have (Label groups/structs must define a fn operationIsOk(comptime op : ZigOperationEnum, orderedLabels : []@This()) @This() callback)
These callback functions could be implemented in userland by anyone wishing to do specific comptime checks utilizing labels. In most cases, though, more user friendly LabelTemplates would create wrappers around the callbacks, providing type checking "presets" to be further tailored by end users.
[note 1]: With operations, I mean unary and binary operators, function call parameter passing, assignments, "address of", indexing etc.. The more operations that are included, the more powerful typechecking with labels could become.
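A LabelGroup providing the two callbacks might look roughly like the following. Several things here are assumptions: the members of `ZigOperationEnum` are invented (the proposal only names the enum), `Length` is a hypothetical group, and `operationIsOk` returns an optional so that a rejection can be observed in a test, whereas the signature in the outline returns `@This()` and would presumably report a compile error instead.

```zig
const std = @import("std");

// Hypothetical: the proposal names this enum but does not list its members.
const ZigOperationEnum = enum { assign, add, sub, mul, bit_or, call_arg };

// Hypothetical LabelGroup for lengths.
const Length = struct {
    unit: enum { meters, feet },

    // Labels of this group may only be attached to floats.
    pub fn typeIsOk(comptime T: type) bool {
        return T == f32 or T == f64;
    }

    // Returns the label of the result, or null if the operation is rejected.
    pub fn operationIsOk(comptime op: ZigOperationEnum, orderedLabels: []const @This()) ?@This() {
        if (orderedLabels.len == 0) return null;
        const first = orderedLabels[0];
        // All operands must carry the same label instance.
        for (orderedLabels[1..]) |l| {
            if (l.unit != first.unit) return null;
        }
        return switch (op) {
            .assign, .add, .sub, .call_arg => first,
            // A product of lengths is not a length; bitwise OR is meaningless.
            .mul, .bit_or => null,
        };
    }
};

test "callbacks accept and reject as expected" {
    const m = Length{ .unit = .meters };
    const ft = Length{ .unit = .feet };
    try std.testing.expect(Length.typeIsOk(f64));
    try std.testing.expect(!Length.typeIsOk(u8));
    try std.testing.expect(Length.operationIsOk(.add, &[_]Length{ m, m }) != null);
    try std.testing.expect(Length.operationIsOk(.add, &[_]Length{ m, ft }) == null);
}
```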
Programmer benefit
Simple and consistent syntax, where end user usage and configuration relies only on very fundamental zig features (structs and instances).
Just a few LabelTemplates defined in the compiler would be enough to enable custom comptime checks suitable for bitflags, physical units, currency, state annotations, ...
More robust refactoring even when relying on primitive and performant types. Compiler will alert you if you assume a "speed" f64 variable to be representing mph, when elsewhere in code it represents m/s.
Other considerations
[:_] ..syntax to infer labels.
// inferring labels with [:_] ...
const a : f64[:_] = 25.0[:labelInstance];
const b: f64[:labelInstance] = 12.0[:_];
// perhaps out of scope:
const areaOfCircle : _[:mathFunc] = fn(r: f64[:radius]) f64[:area]{ ... }[:_]
// currently "_" is not allowed as a placeholder for the type in a variable declaration,
// but perhaps labeling structs or functions is not useful anyhow.
labelVar[:] ..becomes unlabeled: "[:], empty label" syntax to strip labels from a variable.
f64[:labelInstance] "subtypes" f64: ..any function or assignment that expects a non-labeled type will accept a labeled variable of the correct type, but the label will be discarded.
f64 does NOT "subtype" f64[:labelInstance]: ..stronger type checks with labels. Cannot pass a non-labeled variable to somewhere a labeled variable is expected, even if the base type matches.
labeledVariable[:newLabel] fails: ..cannot "relabel" variables. Must first strip the existing label by assigning to a temp unlabeled variable, or by using some "strip label" syntax, e.g. (labeledWithLabel1[:])[:newLabel] or labeledWithLabel1[:][:newLabel]
Why [:...] syntax?: ..because this closely resembles unit notation in physics calculations, does not conflict with any existing syntax, and does not demand introduction of any sigils not already in use.
Multiple labels? ..one possible syntax is [:label1,label2] meaning "apply label1 AND label2", [:label1:label2] meaning "expect label1 OR label2 applied", and [:(label1:label2),label3] meaning "expect label3 AND (label1 OR label2) applied".
Handling of arrays: ..should it be possible to label the elements of an array only, or to label the array itself? If both should be allowed, what should the syntax difference be?
// difference between labeling an array or all elements of the array?
const arr : [10]const u8[:ascii] = undefined;
const arr2 : ([10]const u8)[:unicode] = undefined;
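The strip-then-relabel rule and the element-versus-array distinction can both be mimicked with wrapper types. This is a sketch with hypothetical names (`Encoding`, `Labeled`, `strip`, `attach`); it does not guard against wrapping an already labeled value, and, unlike the proposed "subtyping" rule, dropping a label has to be spelled out with `strip()`.

```zig
const std = @import("std");

const Encoding = enum { ascii, unicode };

fn Labeled(comptime T: type, comptime label_arg: Encoding) type {
    return struct {
        value: T,
        pub const label = label_arg;

        // Mirrors `x[:]`: drop the label and return the bare value.
        pub fn strip(self: @This()) T {
            return self.value;
        }
    };
}

// Mirrors `x[:newLabel]` applied to an unlabeled value.
fn attach(comptime T: type, comptime label: Encoding, value: T) Labeled(T, label) {
    return .{ .value = value };
}

// Label on every element, like `[4]u8[:ascii]`.
const AsciiElems = [4]Labeled(u8, .ascii);
// Label on the array as a whole, like `([4]u8)[:unicode]`.
const UnicodeArray = Labeled([4]u8, .unicode);

test "strip, then relabel" {
    const raw = [4]u8{ 'z', 'i', 'g', '!' };
    const as_unicode: UnicodeArray = attach([4]u8, .unicode, raw);
    // Relabeling goes through the unlabeled value, as in `(x[:])[:newLabel]`.
    const as_ascii = attach([4]u8, .ascii, as_unicode.strip());
    try std.testing.expect(@TypeOf(as_ascii) == Labeled([4]u8, .ascii));
    try std.testing.expect(@TypeOf(as_ascii) != UnicodeArray);
    // Element-wise and whole-array labeling produce different types.
    try std.testing.expect(AsciiElems != UnicodeArray);
}
```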
LabelTemplate
LabelTemplates could be library/userland provided wrappers around the callback functions mentioned in the "outline of procedure" part above.
Examples:
SimpleGroup ..This LabelTemplate represents a demand for equal labels (equal instance) in assignments (lval and rval) and function calls (passed variable and function parameter). Implemented by wrapping an enum. Conforming LabelGroups must embed an enum, and labelInstances wrap a concrete enum value. Example: "PrimaryColor" LabelGroup that embeds const E = enum{red,green,blue}, yields three possible labelInstances, wrapping either E.red, E.green or E.blue. Operators like == or + simply remove the label and then return an unlabeled result if the operands are labeled.
CompoundMeasure ..This LabelTemplate lets you add "unit" or "measure" metadata to primitive numbers, either integers or floats. Can be defined by forcing all LabelGroups implementing CompoundMeasure to have an exponent array of integers, where each entry represents the power of a unit. 4[m/s] => 4 [m=1,s=-1] => 4 [1,-1]. This exponent array is used by the compiler to determine whether operators or assignments are allowed. Two float operands labeled with the same exponent array can be added to or subtracted from each other, for example.
NumberGroup ..This LabelTemplate is similar to SimpleGroup above, but with type checking enabled for operators as well. E.g. a[:label1] * b[:label1] is allowed, a[:label1] * b[:label2] is not.
Both NumberGroup and CompoundMeasure could also have options to tell the compiler which operators are enabled or disabled. This would allow the creation of a "bitflag" number group that only allows equality checks, bitwise operations, and perhaps left/right shift operations. This would make it a compile error to multiply bitflags.
IdGroup ..This LabelTemplate is like SimpleGroup, demanding equal labels for some operations, but also has an ID field (e.g. u64) so that each label can have a unique ID if required. This might possibly be leveraged by tools (IDEs).
TypeRestrictGroup .. This LabelTemplate defines assertions that can be applied to types, as types are comptime known in zig. Would allow a form of comptime interfaces or traits to be implemented.
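The exponent-array idea behind CompoundMeasure can be sketched with comptime parameters. Everything named here is hypothetical (`Dim`, `Quantity`, `combine`), and only two base units are modeled: index 0 is meters and index 1 is seconds, so `4[m/s]` corresponds to the array `{ 1, -1 }`. Addition requires equal arrays; multiplication and division add and subtract exponents.

```zig
const std = @import("std");

// Hypothetical exponent array: index 0 is meters, index 1 is seconds.
const Dim = [2]i8;

// Element-wise a + sign * b, evaluated at comptime in the return types below.
fn combine(a: Dim, b: Dim, sign: i8) Dim {
    var out: Dim = undefined;
    var i: usize = 0;
    while (i < out.len) : (i += 1) {
        out[i] = a[i] + sign * b[i];
    }
    return out;
}

fn Quantity(comptime dim_arg: Dim) type {
    return struct {
        value: f64,
        pub const dim = dim_arg;
        const Self = @This();

        // Addition requires identical exponent arrays (same type).
        pub fn add(a: Self, b: Self) Self {
            return .{ .value = a.value + b.value };
        }

        // Multiplication adds exponents.
        pub fn mul(a: Self, b: anytype) Quantity(combine(dim, @TypeOf(b).dim, 1)) {
            return .{ .value = a.value * b.value };
        }

        // Division subtracts exponents.
        pub fn div(a: Self, b: anytype) Quantity(combine(dim, @TypeOf(b).dim, -1)) {
            return .{ .value = a.value / b.value };
        }
    };
}

test "dividing meters by seconds yields m/s" {
    const Meters = Quantity(Dim{ 1, 0 });
    const Seconds = Quantity(Dim{ 0, 1 });
    const Speed = Quantity(Dim{ 1, -1 });

    const d = Meters{ .value = 100.0 };
    const t = Seconds{ .value = 8.0 };
    const v = d.div(t);
    try std.testing.expect(@TypeOf(v) == Speed);
    try std.testing.expectEqual(@as(f64, 12.5), v.value);
    // d.add(t) is rejected at compile time: the exponent arrays differ.
}
```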
Hypothetical uses:
Distinct types with comptime labels: ...
Integer bitflags with type checking: (NumberGroup is applicable, see gist)
IDE ability to label autogenerated code, (speculative)
state annotation: ...
Unit prefixes, improved precision calculations: (NumberGroup or CompoundMeasure is applicable)
UTF8 is meaningful only in the context of a byte array, not so much a single byte? Sadly the syntax I've been considering doesn't look too good when it comes to labeling whole arrays and not just the elements. The parentheses become necessary.
// explicit, not allowing aliased types to "embed" labels
// quite verbose
pub fn trim(slice : ([]const u8)[:utf8], values_to_strip : ([]const u8)[:utf8]) ([]const u8)[:utf8] { ... }
// allowing creating new "labeled" types, less verbose, but also less explicit
const utf8bytes = ([]const u8)[:utf8];
pub fn trim(slice : utf8bytes, values_to_strip: utf8bytes) utf8bytes { ... }
Note, using the implementation idea above, utf8 would be an instance of a custom LabelGroup struct implementing LabelTemplate "SimpleGroup", as you only need the labels to differentiate byte arrays representing different things.
It is a bit interesting to consider whether "new types" could be created just from aliasing an existing type and adding a label. I didn't consider it at first, but maybe it wouldn't really break the premise of being able to remove all labels from code without any change in runtime behavior. It does certainly remove the verbosity you'd have otherwise.
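The aliasing idea has a close analogue in current Zig, where a type alias can carry the label once and for all. The following is only an approximation with hypothetical names (`Encoding`, `Labeled`, `Utf8Bytes`); the body of `trim` delegates to `std.mem.trim`, which works on bytes rather than code points.

```zig
const std = @import("std");

const Encoding = enum { utf8, latin1 };

fn Labeled(comptime T: type, comptime label_arg: Encoding) type {
    return struct {
        value: T,
        pub const label = label_arg;
    };
}

// The alias "embeds" the label, like `const utf8bytes = ([]const u8)[:utf8];`.
const Utf8Bytes = Labeled([]const u8, .utf8);

pub fn trim(slice: Utf8Bytes, values_to_strip: Utf8Bytes) Utf8Bytes {
    // std.mem.trim is byte-wise, which is only safe here when
    // values_to_strip contains single-byte (ASCII) characters.
    return .{ .value = std.mem.trim(u8, slice.value, values_to_strip.value) };
}

test "trim keeps the utf8 label" {
    const text = Utf8Bytes{ .value = "  héllo  " };
    const spaces = Utf8Bytes{ .value = " " };
    const out = trim(text, spaces);
    try std.testing.expect(@TypeOf(out) == Utf8Bytes);
    try std.testing.expectEqualStrings("héllo", out.value);
}
```

A slice labeled `.latin1` would be a different type and could not be passed to `trim`, which is the differentiation the SimpleGroup template is meant to provide.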