{"id":3147,"date":"2019-03-29T14:57:27","date_gmt":"2019-03-29T14:57:27","guid":{"rendered":"https:\/\/sandbox.weareadaptive.com\/?p=3147\/"},"modified":"2020-08-13T15:32:04","modified_gmt":"2020-08-13T14:32:04","slug":"antlr4-expression-parsing-part-2","status":"publish","type":"post","link":"https:\/\/sandbox.weareadaptive.com\/fr\/2019\/03\/29\/antlr4-expression-parsing-part-2\/","title":{"rendered":"ANTLR4 and expression parsing (Part 2)"},"content":{"rendered":"<h2>Introduction<\/h2>\n<p>If you have not read part 1 of this blog post, you can do so here:<\/p>\n<p><span class=\"btn-flat\"><a href=\"https:\/\/sandbox.weareadaptive.com\/2018\/11\/21\/using-antlr-parse-calculate-expressions-part\/\" target=\"_blank\" rel=\"noopener noreferrer\">Using ANTLR to parse and calculate expressions\u00a0(Part 1)<\/a><\/span><\/p>\n<p>In the previous post, we talked about creating a custom ANTLR4 grammar. Our grammar can parse formulas with basic mathematical operators and it defines some custom expressions that allow us to implement domain logic which, combined with the formula parsing, allow us to calculate domain-specific concepts like prices, amounts and conversion factors between units of measure.<\/p>\n<p>In this post, we\u2019ll get a decimal result out of some simple expressions and write some tests to make sure our initial calculator is working.<\/p>\n<p>In our grammar, we have two operators that have widespread use in probably every trading and banking system in the world. They are:<\/p>\n<ul>\n<li>UomConvert(FromUnitOfMeasure, ToUnitOfMeasure)<\/li>\n<li>FXRate(\u2018CurrencyPair\u2019)<\/li>\n<\/ul>\n<p>The first one converts between units of measure. The second one is a simplified FX rate. We can interpret it as an FX Spot Rate between two currencies. While <a href=\"https:\/\/en.wikipedia.org\/wiki\/Foreign_exchange_spot\" rel=\"noopener\">FX Spot Rates <\/a>are meaningless without a date component, for the purposes of this post, we will keep it simple and always assume \u2018latest known FX Rate<a href=\"https:\/\/en.wikipedia.org\/wiki\/Foreign_exchange_spot\" rel=\"noopener\">\u2019<\/a>.<\/p>\n<p>Our first step is to get a basic calculator working. To do that we\u2019ll need to be able to handle the different inputs in the following table.<\/p>\n<table width=\"822\">\n<tbody>\n<tr>\n<td width=\"109\">\n<h3 style=\"text-align: center;\"><strong>Input<\/strong><\/h3>\n<\/td>\n<td width=\"80\">\n<h3 style=\"text-align: center;\"><strong>Result<\/strong><\/h3>\n<\/td>\n<td width=\"492\">\n<h3><strong>Comments<\/strong><\/h3>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\" width=\"109\">2<\/td>\n<td style=\"text-align: center;\" width=\"80\">2<\/td>\n<td width=\"492\">Number identity should work.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\" width=\"109\">2 * 2<\/td>\n<td style=\"text-align: center;\" width=\"80\">4<\/td>\n<td width=\"492\">Standard math operators (addition, multiplication, subtraction and division)<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\" width=\"109\">(2 + 2) * 4<\/td>\n<td style=\"text-align: center;\" width=\"80\">16<\/td>\n<td width=\"492\">Parenthesis priority should be respected.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>Extending the base visitor<\/h2>\n<p>In our previous post we did all the plumbing necessary to generate some code that we can work with. This generated code is added automatically into our project thanks to the new C# project file (*.csproj) format.<\/p>\n<p>As a result of the plumbing work, we\u2019re telling the ANTLR runtime to generate a visitor. This visitor gives us some classes we can inherit from, and add some logic to, to get the desired result. The generated code will give us an expression tree, as well as the necessary tools to perform custom actions each time we visit a node in the tree. Nodes can be a recognized character, an operator, or an expression.<\/p>\n<p>Our main goal is to calculate the result of our expression. To do this, we first need to inherit from the MyGrammarBaseVisitor class, which has a generic type parameter that defines our expression\u2019s result type.<\/p>\n<p>To start, the visitor that will inherit from our base class will\u2014for now\u2014override the basic mathematical operations that we defined in our g<a href=\"https:\/\/github.com\/AdaptiveConsulting\/Blog\/blob\/076985e7d6ca9887679abded97f6af0837ddb911\/AntlrExpressions\/Part%202\/ExpressionParser\/src\/ExpressionParser.Language\/Expressions\/Grammar\/MyGrammar.g4#L14\" rel=\"noopener\">rammar file<\/a>. These are \u2018Number\u2019, \u2018Parens\u2019 (parentheses), and the operators \u2018ADD\u2019, \u2018SUB\u2019, \u2018MUL\u2019, and \u2018DIV\u2019, using decimal as a return type.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"default\" data-enlighter-title=\"\">public class MyGrammarVisitor : MyGrammarBaseVisitor&lt;decimal&gt;\r\n{\r\n    public override decimal VisitParens(MyGrammarParser.ParensContext context)\r\n    {\r\n        return base.VisitParens(context);\r\n    }\r\n\r\n    public override decimal VisitNum(MyGrammarParser.NumContext context)\r\n    {\r\n        return base.VisitNum(context);\r\n    }\r\n\r\n    public override decimal VisitAddSub(MyGrammarParser.AddSubContext context)\r\n    {\r\n        return base.VisitAddSub(context);\r\n    }\r\n\r\n    public override decimal VisitMulDiv(MyGrammarParser.MulDivContext context)\r\n    {\r\n        return base.VisitMulDiv(context);\r\n    }\r\n}\r\n<\/pre>\n<p>We override the operations defined in our grammar labeled with the # symbol (these are for convenience as they let us know which method names we need to override). The overridden methods cover all the operations we need to implement the calculator.<\/p>\n<h2>Context<\/h2>\n<p>Each overridden method is handed a context variable whose type is specific to the method we\u2019re overriding. This is extremely helpful because within the method we define what steps need taking via code, and which data to perform the steps on is provided by the parameter.<\/p>\n<p>It\u2019s time to implement the logic for each of these methods. But how? Let\u2019s go back to our <a href=\"https:\/\/github.com\/AdaptiveConsulting\/Blog\/blob\/076985e7d6ca9887679abded97f6af0837ddb911\/AntlrExpressions\/Part%202\/ExpressionParser\/src\/ExpressionParser.Language\/Expressions\/Grammar\/MyGrammar.g4#L14\" rel=\"noopener\">grammar definition<\/a> to remind ourselves:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"default\" data-enlighter-title=\"\">expr: expr op=(MUL | DIV) expr    #mulDiv\r\n| expr op=(ADD | SUB) expr        #addSub\r\n| number                          #num\r\n| &#039;(&#039; expr &#039;)&#039;                    #parens\r\n| fxRateFunc                      #fxRate \/\/let\u2019s ignore these last two for now.\r\n| uomConvertFunc                  #uomFactor\r\n;\r\n<\/pre>\n<p>Let\u2019s start simple, we\u2019ll first implement VisitNum. Because #num is just \u2018number\u2019 all we need to do is get the text content of our context parameter and parse it to decimal.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"default\" data-enlighter-title=\"\">public override decimal VisitNum(MyGrammarParser.NumContext context)\r\n{\r\n    return decimal.Parse(context.GetText());\r\n}\r\n<\/pre>\n<p>Easy peasy\u2026let\u2019s move on to VisitParens.<\/p>\n<p>In the grammar, parens is defined as such:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"default\" data-enlighter-title=\"\">| &#039;(&#039; expr &#039;)&#039;\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 #parens<\/pre>\n<p>We have to visit the expression contained within the parenthesis, so we simply call Visit on it:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"default\" data-enlighter-title=\"\">public override decimal VisitParens(MyGrammarParser.ParensContext context)\r\n{\r\n    return Visit(context.expr());\r\n}\r\n<\/pre>\n<p>Note above how expr() is a method. This is because we\u2019ve recursively defined it: we need to treat is as a new expression, which is a tree itself, so we have to make sure we\u2019re traversing the entire tree. The method will then be dispatched into whatever Visit override is appropriate.<\/p>\n<p>The last two operators are similar, so we\u2019ll do them at the same time.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"default\" data-enlighter-title=\"\">expr: expr op=(MUL | DIV) expr\u00a0 \u00a0 \u00a0 \u00a0 #mulDiv\r\n| expr op=(ADD | SUB) expr              #addSub<\/pre>\n<p>As we can see from the grammar, a math operator has two expressions, one on each side. We need to first visit both the left and right sides to get our operands. Once we have these two values, we check what operator we\u2019re using and apply the proper calculation.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"default\" data-enlighter-title=\"\">public override decimal VisitAddSub(MyGrammarParser.AddSubContext context)\r\n{\r\n    \/\/ The result of this should always be the operands \r\n    var left = Visit(context.expr(0));\r\n    var right = Visit(context.expr(1));\r\n    return context.op.Type == MyGrammarParser.ADD ? left + right : left - right;\r\n}\r\n\r\npublic override decimal VisitMulDiv(MyGrammarParser.MulDivContext context)\r\n{\r\n    \/\/ The result of this should always be the operands \r\n    var left = Visit(context.expr(0));\r\n    var right = Visit(context.expr(1));\r\n    return context.op.Type == MyGrammarParser.MUL ? left * right : left \/ right;\r\n}\r\n<\/pre>\n<p>Note that access to the left\u2013 and right-hand sides is indexed-based, where 0 is left, and 1 is right. It\u2019s also worth noting how we need to check the operation\u2019s Type in order to know what operation we want to perform. This is all part of the generated code, and you can just browse to it in your IDE.<\/p>\n<h2>Let\u2019s parse a formula!<\/h2>\n<p>Now we need something that goes from an input string to a parsed visitor and ultimately a decimal. For that we need to construct an AntlrInputStream object, a lexer, and a TokenStream. This is all part of the plumbing that needs to happen before we can create an actual MyGrammarParser and send that expression tree through to our visitor. I\u2019ve created a class called MyGrammarExpressionEvaluator to do this part. The code is as follows:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"default\" data-enlighter-title=\"\">public static class MyGrammarExpressionEvaluator\r\n{\r\n    public static decimal Eval(string inputExpression)\r\n    {\r\n        using (var stringReader = new StringReader(inputExpression))\r\n        {\r\n            var inputStream = new AntlrInputStream(stringReader);\r\n            var lexer = new MyGrammarLexer(inputStream);\r\n            var tokens = new CommonTokenStream(lexer);\r\n            var parser = new MyGrammarParser(tokens);\r\n            var expressionTree = parser.expr();\/\/ lets parse the whole thing\r\n            var visitor = new MyGrammarVisitor(); \/\/ now that we have a &quot;parsed expression&quot;\r\n            var result = visitor.Visit(expressionTree);  \/\/ let&#039;s visit all the nodes.\r\n            return result;\r\n        }\r\n    }\r\n}\r\n<\/pre>\n<p>Now we can write some unit tests that will ensure we\u2019re on track. I\u2019ve used NUnit, with the TestCase attribute to reuse code and make sure that the basic math operators are doing what we expect them to do.<\/p>\n<p>Also note that we\u2019re testing expressions that contain parentheses:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"default\" data-enlighter-title=\"\">[TestCase(&quot;2+2&quot;, 4)]\r\n[TestCase(&quot;2*2&quot;, 4)]\r\n[TestCase(&quot;2-2&quot;, 0)]\r\n[TestCase(&quot;2\/2&quot;, 1)]\r\n[TestCase(&quot;(2+2)&quot;, 4)]\r\n[TestCase(&quot;(2+2)*2&quot;, 8)]\r\n[TestCase(&quot;(2+2)\/2&quot;, 2)]\r\npublic void BasicMathsWork(string numberExpression, decimal expectedResult)\r\n{\r\n    var result = MyGrammarExpressionEvaluator.Eval(numberExpression);\r\n   \r\n    Assert.That(result, Is.EqualTo(expectedResult));\r\n}\r\n<\/pre>\n<p>So far, so good; we\u2019ve got basic calculation working! Good times! No, really, now the plot thickens. I\u2019ve purposefully only implemented these operations, as the other two supported expressions\u2014fxRateFunc and uomConvertFunc\u2014require some more work before we can use them as input. See Part 3 (coming soon!) for the details on UomConvert!<\/p>\n<h2>Link to the Github Repo for this entry:<\/h2>\n<p><a href=\"https:\/\/github.com\/AdaptiveConsulting\/Blog\/tree\/master\/AntlrExpressions\/Part%202\/ExpressionParser\" rel=\"noopener\">https:\/\/github.com\/AdaptiveConsulting\/Blog\/tree\/master\/AntlrExpressions\/Part%202\/ExpressionParser<\/a><\/p>\n<h2><strong>Author:<\/strong><\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-3163 alignleft\" src=\"https:\/\/sandbox.weareadaptive.com\/wp-content\/uploads\/2019\/03\/Carlos.png\" alt=\"\" width=\"272\" height=\"268\" srcset=\"https:\/\/sandbox.weareadaptive.com\/wp-content\/uploads\/2019\/03\/Carlos.png 464w, https:\/\/sandbox.weareadaptive.com\/wp-content\/uploads\/2019\/03\/Carlos-300x296.png 300w\" sizes=\"(max-width: 272px) 100vw, 272px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<h3><strong><br \/>\nCarlos Fernandez<\/strong><\/h3>\n<p>Senior Software Engineer, Adaptive Barcelona<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction If you have not read part 1 of this blog post, you can do so here: Using ANTLR to &#8230;<\/p>\n","protected":false},"author":24,"featured_media":3065,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[],"class_list":["post-3147","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog"],"_links":{"self":[{"href":"https:\/\/sandbox.weareadaptive.com\/fr\/wp-json\/wp\/v2\/posts\/3147"}],"collection":[{"href":"https:\/\/sandbox.weareadaptive.com\/fr\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sandbox.weareadaptive.com\/fr\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sandbox.weareadaptive.com\/fr\/wp-json\/wp\/v2\/users\/24"}],"replies":[{"embeddable":true,"href":"https:\/\/sandbox.weareadaptive.com\/fr\/wp-json\/wp\/v2\/comments?post=3147"}],"version-history":[{"count":0,"href":"https:\/\/sandbox.weareadaptive.com\/fr\/wp-json\/wp\/v2\/posts\/3147\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sandbox.weareadaptive.com\/fr\/wp-json\/wp\/v2\/media\/3065"}],"wp:attachment":[{"href":"https:\/\/sandbox.weareadaptive.com\/fr\/wp-json\/wp\/v2\/media?parent=3147"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sandbox.weareadaptive.com\/fr\/wp-json\/wp\/v2\/categories?post=3147"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sandbox.weareadaptive.com\/fr\/wp-json\/wp\/v2\/tags?post=3147"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}