Monday, October 19, 2009

A quick look at J

In this post I'm going to show a small overview of the J programming language. An example of polynomial multiplication is examined.

J


J is an array programming language derived from APL which means is good for manipulating arrays and matrices. Its syntax and semantics are very different from other languages which makes it an interesting topic for studying.

There's a lot of documentation and examples available from the J software website. Two complete tutorials are "Learning J" by Roger Stokes and "J for C programmers" by Henry Rich.

The J distribution also includes examples and tutorials. The examples will be presented using J's REPL called jconsole.

The example



In order to give an overview of the language I'm going to start from the definition of a function that performs polynomial multiplication and examine how it was constructed.

The function is defined as follows:

polymulti =: dyad : '+/ (((_1 * i. #y) (|. "0 1) ((x (*"_ 0) y) (,"1 1) (((#y) - 1) $ 0))) , 0)'


Writing this example helped me understand some of the J's basic concepts (I'm completely sure there's a better/more efficient way to do this!).

Polynomial multiplication



The basic technique for polynomial multiplication consists on multiplying each term of one of the polynomials by the order and them simply the result. For example:


(4x3 - 23x2 + x + 1) * (2x2 - x - 3)

= ((4x3 - 23x2 + x + 1) * 2x2) +
((4x3 - 23x2 + x + 1) * -x) +
((4x3 - 23x2 + x + 1) * -3)

= (8x5 - 46x4 + 2x3 + 2x2) +
(-4x4 + 23x3 - x2 - x) +
(-12x3 + 69x2 - 3x - 3)

= (8x5 - 50x4 + 13x3 + 70x2 - 4x - 3)


Now I'll start creating polymulti from the bottom up to the definition.

1. Defining polynomials



As mentioned above J is a nice language for manipulating arrays. We're going to use J arrays to define polynomials. In fact J supports some operations on arrays as polynomials which will be described bellow.

In order to write a new array literal containing 1,1, -23 and 4 in J we write (here using jconsole):


1 1 _23 4
1 1 _23 4


As you can see the elements of the array are separated by space. Also negative numbers are prefixed by underscore '_'. This array is going to be used to represent the coefficients of the "(4x3 - 23x2 + x + 1)" polynomial.

We're going to define two variables with the polynomials shown above to use them for examples:


p1 =: 1 1 _23 4
p2 =: _3 _1 2


Here the '=:' operator is used to bind the specified arrays with p1 and p2.

2. Multiply each element of one of the polynomials



We can proceed to apply the first step in the process which is multiply each element of the second operand by the first operand.

In J we can operate on arrays easily, for example if we want to multiply each element of the above array by a 2 we write:


1 1 _23 4 * 2
2 2 _46 8


In J the an operation that receives two arguments is called a dyad and a operation that receives only one is called monad. Here we're using the '*' dyad to perform the multiplication.

We cannot directly go and type "p1 * p2" because we will get the following error:


p1 * p2
|length error
| p1 *p2


This happens because the length of the two arrays (4 and 3) could not be used to perform the operation. We can apply '*' to a two same size arrays and J will multiply each element. For example:


p1 * p1
1 1 529 16


Now what I want is to multiply each element of 'p2' by 'p1'. In order to do this we can change the behavior of the '*' dyad by specifying its rank (more details on how to do this can be found in "Verb Execution -- How Rank Is Used (Dyads)") . To change the rank of '*' and say that we want to multiply the complete array on the left for each of the cells of the right array we write (*"_ 0) .For example:


p1 (*"_ 0) p2
_3 _3 69 _12
_1 _1 23 _4
2 2 _46 8


By specifying (*"_ 0) we say that the left operand has "infinite rank" (_) which means it will consider the array as single unit.We also say that for the right argument we will consider every cell (by using the 0 rank). Notice that in order to modify the rank we use double quote (") which, in J, doesn't have to paired with another double quote as in most programming languages.

Also you can notice that the result of this operation is an array of arrays, one for each element of the right argument.

2. Sum each element of one of the polynomials

Now that we have an array of polynomials with the result of multiplying the coefficients, we need to change the degree of each of the polynomials. First we need to make more space to increment the degree of the polynomials. We're going to use the ',' dyad which lets you concatenate two arrays. For example:


p1 , 0 0
1 1 _23 4 0 0


We need to append an array that is the size of the second polynomial minus one. In order to create a new array of this size we use '$' which lets you create an array or matrix by specifying the size and the value of the elements of the new array. For example:


10 $ 0
0 0 0 0 0 0 0 0 0 0


The size of the array can be obtained with the '#' monad. For example:


# p1
4


Now we can combine all these elements to resize each array resulting from the multiplication of the coefficients like this:


(p1 (*"_ 0) p2)
_3 _3 69 _12
_1 _1 23 _4
2 2 _46 8
(((#p2) - 1 ) $ 0)
0 0
(p1 (*"_ 0) p2) (,"1 1) ( ((#p2) - 1 ) $ 0)
_3 _3 69 _12 0 0
_1 _1 23 _4 0 0
2 2 _46 8 0 0


Here we use (((#p2) - 1 ) $ 0) to generate an array of zeros of the desired size. Then we use (,"1 1) which is the ',' dyad with a modified rank saying that each element of the left matrix (an array) will be concatenated with the second array (since the second argument is a single dimension array we could have said (,"1 _) ).

With the arrays resized we can change the degree of our polynomial array. In order to do this we're going to use the '|.' dyad which let's you rotate an array an specified amount of positions. For example:

 
1 |. 1 2 3
2 3 1
_1 |. 1 2 3
3 1 2


As shown in the example, by specifying a positive number the array will be rotated to the left and a negative number to the right.

Now the question is, how to rotate each element of the polynomial array by a different element count? Before showing that we're going to introduce the 'i.' primitive which lets you create an array of with a sequence of numbers for example, to create an array of 10 numbers (starting with 0) we write:


i. 10
0 1 2 3 4 5 6 7 8 9


We can use this primitive in conjunction with the rotate primitive to say


m =: (p1 (*"_ 0) p2) (,"1 1) ( ((#p2) - 1 ) $ 0)
i. #p2
0 1 2
_1 * i. #p2
0 _1 _2
(_1 * i. #p2) (|."0 1) m
_3 _3 69 _12 0 0
0 _1 _1 23 _4 0
0 0 2 2 _46 8


As you can see we use the array generated by (_1 * i. #p2) to specify how many positions are we moving. We apply the change the rank of '|.' by saying (|."0 1) which means apply '|.' for each single cell of the left array to each row of the right array.

The previous step generated an array polynomials to be summed. So the only thing left is to generate the final polynomial. In order to do this we use the '+/' dyad which let's use sum the contents of an array. For example:


+/ 5 3 4 1
13


We can apply +/ to any array and J will do the operation as expected.


+/ (_1 * i. #p2) (|."0 1) m
_3 _4 70 13 _50 8


As a work around I'm appending a zero row to this matrix, just in case the p2 array is a 0 degree polynomial.


+/ ((_1 * i. #p2) (|."0 1) m) , 0
_3 _4 70 13 _50 8


And that's it we have the complete process for multiplying the array.

3. Create the function

Now putting all the elements described above we can put together the polynomial multiplication dyad:

polymulti =: dyad : '+/ (((_1 * i. #y) (|. "0 1) ((x (*"_ 0) y) (,"1 1) (((#y) - 1) $ 0))) , 0)'


The 'x' and 'y' names are the implicit names of the left and right operands.

We can use it with any pair of polynomials. For example:


2 34 3 polymulti 4 3
8 142 114 9
2 34 3 polymulti 4 0 3
8 136 18 102 9
2 34 3 polymulti 1
2 34 3


4. Use the polynomials

J already has support for polynomials inside the library. For example by using the 'p.' we can get an evaluation of a given polynomial.


3 4 5 p. 34
5919
3 + (4*34) + (5*34*34)
5919
_2 p. 34
_2
(5 3 _2 polymulti 3 0 0 _2)
15 9 _6 _10 _6 4
(5 3 _2 polymulti 3 0 0 _2) p. 3
204


Final words



It was very interesting to read about J. It offers a different perspective on programming which is definitely worth studying. I have to admit it was difficult at the beginning because of the number of concepts to learn ( only a couple are presented in this post) and the syntax . For future posts I'm going to try to explore more J features.

Tuesday, October 6, 2009

AS3 Getter/Setter support in AbcExplorationLib

Recently I added initial support for reading and writing AS3/Avm2 getters and setters to AbcExplorationLib.

Given the following ActionScript class:


class Complex {
public var radius:Number;
public var angle:Number;
public function Complex(ar:Number,aa:Number):void {
radius = ar;
angle = aa;
}

public function set imaginary(newImaginary:Number):void
{
var oldReal = this.real;
angle = Math.atan(newImaginary/oldReal);
radius = Math.sqrt(oldReal*oldReal + newImaginary*newImaginary);
}
public function get imaginary():Number
{
return radius*Math.sin(angle);
}

public function set real(newReal:Number):void
{
var oldImaginary = this.real;
angle = Math.atan(oldImaginary/newReal);
radius = Math.sqrt(newReal*newReal + oldImaginary*oldImaginary);
}
public function get real():Number
{
return radius*Math.cos(angle);
}
}


We compile it using the Flex SDK:
java -jar c:\flexsdk\lib\asc.jar complexclasstest.as


Then we can load it using the library:


Microsoft F# Interactive, (c) Microsoft Corporation, All Rights Reserved
F# Version 1.9.6.16, compiling for .NET Framework Version v2.0.50727

Please send bug reports to fsbugs@microsoft.com
For help type #help;;

> #r "abcexplorationlib.dll";;

--> Referenced 'abcexplorationlib.dll'

> open System.IO;;
> open Langexplr.Abc;;
> let f = using (new FileStream("complexclasstest.abc",FileMode.Open)) (fun s -> AvmAbcFile.Create(s));;

val f : AvmAbcFile

> let complexClass = List.hd f.Classes;;

val complexClass : AvmClass


> complexClass.Properties |> List.map (fun p -> p.Name);;
val it : QualifiedName list =
[CQualifiedName (Ns ("",PackageNamespace),"real");
CQualifiedName (Ns ("",PackageNamespace),"imaginary")]
> let realGetter = complexClass.Properties |> List.map (fun p -> p.Getter.Value) |> List.hd;;

val realGetter : AvmMemberMethod:
> realGetter.Method.Body.Value.Instructions |> Array.map (fun x -> x.Name);;
val it : string array =
[|"getlocal_0"; "pushscope"; "getlocal_0"; "getproperty";
"findpropertystrict"; "getproperty"; "getlocal_0"; "getproperty";
"callprop"; "multiply"; "returnvalue"|]





The library can be found here.

Thursday, October 1, 2009

Parsing Javascript using Newspeak parsing combinators

I've been working on a parser for Javascript/Ecmascript using Newspeak parsing combinators. The parser is based on the grammar presented in the ECMAScript Language Specification [PDF] document. It is still incomplete, however it can parse simple statements.

The grammar looks like this:


class JSGrammar = ExecutableGrammar (
"Experiment for JS grammar based on the description from http://www.ecma-international.org/publications/standards/Ecma-262.htm"
|
doubleQuote = (char: $").
backslash = (char: $\).
str = doubleQuote,((backslash, ( char: $" )) |
(backslash, ( char: $/ )) |
(backslash, backslash) |
(backslash, ( char: $r )) |
(backslash, ( char: $n )) |
(backslash, ( char: $t )) |
(charExceptFor: $")) star, doubleQuote.
string = tokenFor: str.

tilde = char: $~.
exclamation = char: $!.
starChar = char: $*.
slash = char: $/.
modulo = char: $%.
pipe = char: $|.
amp = char: $&.
cir = char: $^.
question = char: $?.
colon = char: $:.
semicolon = char: $;.

negSign = (char: $-).
plusSign = (char: $+).
digit = (charBetween: $0 and: $9).
dot = (char: $. ) .
lt = char: $<.
gt = char: $>.
eq = char: $=.
num = negSign opt, digit, digit star, dot opt,digit star, ((char: $e) | (char: $E)) opt, (plusSign | negSign) opt,digit star.
number = tokenFor: num.

tQuestion = tokenFor: question.
tColon = tokenFor: colon.
tplusSign = tokenFor: plusSign.
tnegSign = tokenFor: negSign.
tmodulo = tokenFor: modulo.
tslash = tokenFor: slash.
tstarChar = tokenFor: starChar.
texclamation = tokenFor:exclamation.
tdot = tokenFor:dot.
tLt = tokenFor: lt.
tGt = tokenFor: gt.
tEq = tokenFor: eq.
tAmp = tokenFor: amp.
tPipe = tokenFor: pipe.
tCir = tokenFor: cir.
tSlash = tokenFor: slash.
tSemicolon = tokenFor: semicolon.

tStarEq = tstarChar,eq.
tModEq = tmodulo,eq.
tSlashEq = tSlash,eq.
tPlusEq = tplusSign,eq.
tMinusEq = tnegSign,eq.
tAmpAmp = tAmp,amp.
tPipePipe = tPipe,pipe.
tLtEq = tLt,eq.
tGtEq = tGt,eq.
tleftShift = tLt,lt.
trightShift = tGt,gt.
tsRightShift = tGt,gt,gt.
tEqEq = tEq,eq.
tEqEqEq = tEq,eq,eq.
tNotEq = texclamation,eq.
tNotEqEq = texclamation,eq,eq.
tleftShiftEq = tleftShift,eq.
trightShiftEq = trightShift,eq.
tsRightShiftEq = tsRightShift,eq.
tAmpEq = tAmp,eq.
tPipeEq = tPipe,eq.
tCirEq = tCir,eq.

lineTerminator = (char: (Character lf)) | (char: (Character cr)).

regularExpressionLiteral =
tslash,
( ((backslash, ( charExceptForCharIn: { (Character lf). (Character cr). })) |
(charExceptForCharIn: { (Character lf). (Character cr). $/.})) plus),
slash, (identifierStart star).

leftparen = tokenFromChar: $(.
rightparen =tokenFromChar: $).

leftbrace = tokenFromChar: ${.
rightbrace =tokenFromChar: $}.
comma = tokenFromChar: $,.
propertyName = string | identifier | number.
propertyNameAndValue = propertyName,tColon,value.
obj = leftbrace, (propertyNameAndValue starSeparatedBy: comma),rightbrace.
object = obj.

leftbracket = tokenFromChar: $[.
rightbracket = tokenFromChar: $].
arr = leftbracket, (value starSeparatedBy: comma), rightbracket.
array = tokenFor: arr.

comment = (slash,starChar,blockCommentBody,starChar,slash) | (slash,slash, lineCommentBody).


ttrue = tokenFromSymbol: #true.
tfalse = tokenFromSymbol: #false.
null = tokenFromSymbol: #null.
function = tokenFromSymbol: #function.
tnew = tokenFromSymbol: #new.
break = tokenFromSymbol: #break.
case = tokenFromSymbol: #case.
catch = tokenFromSymbol: #catch.
continue = tokenFromSymbol: #continue.
default = tokenFromSymbol: #default.
delete = tokenFromSymbol: #delete.
do = tokenFromSymbol: #do.
else = tokenFromSymbol: #else.
finally = tokenFromSymbol: #finally.
for = tokenFromSymbol: #for.
if = tokenFromSymbol: #if.
in = tokenFromSymbol: #in.
instanceof = tokenFromSymbol: #instanceof.
return = tokenFromSymbol: #return.
switch = tokenFromSymbol: #switch.
this = tokenFromSymbol: #this.
throw = tokenFromSymbol: #throw.
try = tokenFromSymbol: #try.
typeof = tokenFromSymbol: #typeof.
var = tokenFromSymbol: #var.
void = tokenFromSymbol: #void.
while = tokenFromSymbol: #while.
with = tokenFromSymbol: #with.



letter = (charBetween: $a and: $z) | (charBetween: $A and: $Z).
identifierStart = letter | (char: $$) | (char: $_).
identifier = accept: (tokenFor: (identifierStart), (identifierStart | digit) star) ifNotIn: keywords .

value = assignmentExpression .

literal = null | ttrue | tfalse | number | string | regularExpressionLiteral.

primaryexpression = this | literal | identifier | array | object | parenthesized.

parenthesized = leftparen,expression,rightparen.

functionexpression = function , identifier opt,
leftparen,formalParameterList , rightparen ,
leftbrace,sourceElements,rightbrace.
formalParameterList = identifier starSeparatedBy: comma.

memberexpression = (simplememberexpression ),
(( leftbracket, expression, rightbracket) | ( tdot, identifier)) star.

simplememberexpression = primaryexpression |
functionexpression |
simpleNewExpression.
simpleNewExpression = tnew,memberexpression, arguments.



callExpression = (simplememberexpression ),
( arguments |
( leftbracket, expression, rightbracket) |
( tdot, identifier) ) star.
simpleCallExpression = memberexpression , arguments.
arguments = leftparen ,
(assignmentExpression, (comma, assignmentExpression) star) opt,
rightparen.

newExpression = memberexpression | simpleNewMemberExpression.
simpleNewMemberExpression = tnew, memberexpression.

plusPlus = plusSign,plusSign.
minusMinus = negSign,negSign.

leftHandSideExpression = callExpression | newExpression .
postfixExpression = leftHandSideExpression ,
((plusPlus | minusMinus) star).


unaryExpression = postfixExpression | complexUnaryExpression.

complexUnaryExpression =
(typeof, unaryExpression) |
(delete, unaryExpression) |
(void, unaryExpression) |
(plusPlus, unaryExpression) |
(minusMinus, unaryExpression) |
((tokenFor: ( plusSign | negSign | tilde | exclamation )), unaryExpression).

multiplicativeExpression =
unaryExpression, ((tstarChar | tslash | tmodulo), unaryExpression) star.

additiveExpression =
multiplicativeExpression, ((tplusSign | tnegSign), multiplicativeExpression) star.

shiftExpression =
additiveExpression, ((tsRightShift | tleftShift | trightShift), additiveExpression) star.

relationalExpression =
shiftExpression, (( tLtEq | tGtEq | tLt | tGt | instanceof | in) , shiftExpression) star.

relationalExpressionNoIn =
shiftExpression, (( tLtEq | tGtEq | tLt | tGt | instanceof ) , shiftExpression) star.

equalityExpression =
relationalExpression, ((tEqEqEq | tEqEq | tNotEqEq | tNotEq ), relationalExpression) star.

equalityExpressionNoIn =
relationalExpressionNoIn, ((tEqEqEq | tEqEq | tNotEqEq | tNotEq ), relationalExpressionNoIn) star.

bitwiseANDExpression =
equalityExpression,(tAmp, equalityExpression) star.

bitwiseANDExpressionNoIn =
equalityExpressionNoIn,(tAmp, equalityExpressionNoIn) star.

bitwiseXORExpression =
bitwiseANDExpression,(tCir, bitwiseANDExpression) star.

bitwiseXORExpressionNoIn =
bitwiseANDExpressionNoIn,(tCir, bitwiseANDExpressionNoIn) star.

bitwiseORExpression =
bitwiseXORExpression,(tPipe, bitwiseXORExpression) star.

bitwiseORExpressionNoIn =
bitwiseXORExpressionNoIn,(tPipe, bitwiseXORExpressionNoIn) star.

logicalAndExpression =
bitwiseORExpression, (tAmpAmp,bitwiseORExpression) star.

logicalAndExpressionNoIn =
bitwiseORExpressionNoIn, (tAmpAmp,bitwiseORExpressionNoIn) star.

logicalOrExpression =
logicalAndExpression, (tPipePipe,logicalAndExpression) star.

logicalOrExpressionNoIn =
logicalAndExpressionNoIn, (tPipePipe,logicalAndExpressionNoIn) star.

assignmentOperator =
tEq | tStarEq | tSlashEq | tModEq | tPlusEq | tMinusEq |
tleftShiftEq | tsRightShiftEq | trightShiftEq | tAmpEq | tPipeEq |
tCirEq.

conditionalExpression =
logicalOrExpression, (tQuestion, assignmentExpression,tColon,assignmentExpression) opt.

assignmentExpression =
conditionalExpression, (assignmentOperator,conditionalExpression) star.

conditionalExpressionNoIn =
logicalOrExpressionNoIn, (tQuestion, assignmentExpressionNoIn,tColon,assignmentExpressionNoIn) opt.

assignmentExpressionNoIn =
conditionalExpressionNoIn, (assignmentOperator,conditionalExpressionNoIn) star.

expression = assignmentExpression, (comma , assignmentExpression) star.

expressionNoIn = assignmentExpressionNoIn, (comma , assignmentExpressionNoIn) star.


statement = block | variableStatement | emptyStatement | expressionStatement |
ifStatement | iterationStatement | withStatement | switchStatement |
labelledStatement | tryStatement | throwStatement |
breakStatement | returnStatement.

block = leftbrace, statementList , rightbrace.

statementList = statement star.
variableStatement = var, variableDeclarationList, tSemicolon.

variableDeclarationList = (variableDeclaration plusSeparatedBy: comma).
variableDeclaration = identifier, (tEq,assignmentExpression) opt.

variableDeclarationListNoIn = (variableDeclarationNoIn plusSeparatedBy: comma).
variableDeclarationNoIn = identifier, (tEq,assignmentExpressionNoIn) opt.

emptyStatement = tSemicolon.

expressionStatement = (((function | leftbrace) not) & expression), tSemicolon.

ifStatement = if, leftparen,expression,rightparen,statement,(else,statement) opt.

iterationStatement = doStatement | forStatement | forStatementNoVar |
whileStatement | forInStatement | forInStatementNoVar.

doStatement = do, statement, while, leftparen,expression,rightparen,tSemicolon.

forStatement = for, leftparen,
var, variableDeclarationListNoIn ,
tSemicolon,
(expression opt),
tSemicolon,
(expression opt),
rightparen, statement.

forStatementNoVar = for, leftparen,
(expression opt),
tSemicolon,
(expression opt),
tSemicolon,
(expression opt),
rightparen, statement.


forInStatement = for, leftparen,
var, variableDeclarationListNoIn ,
in,
expression ,
rightparen, statement.

forInStatementNoVar = for, leftparen,
leftHandSideExpression ,
in,
expression ,
rightparen, statement.
whileStatement = while,leftparen,expression,rightparen,statement.

continueStatement = continue, (identifier opt), tSemicolon.

breakStatement = break, (identifier opt), tSemicolon.

returnStatement = return, (expression opt), tSemicolon.

withStatement = with, leftparen, expression ,rightparen, statement.

switchStatement = switch ,leftparen,expression,rightparen, clauseBlock.

clauseBlock = leftbrace,(clause star),(defaultClause opt),rightbrace.

clause = case, expression, tColon,statementList.

defaultClause = default,tColon, statementList.

labelledStatement = identifier,tColon,statement.

throwStatement = throw,expression,tSemicolon.

tryStatement = try, block, (catchBlock opt), (finallyBlock opt).

catchBlock = catch, leftparen, identifier,rightparen, block.

finallyBlock = finally, block.

functionDeclaration = function,identifier,
leftparen,formalParameterList,rightparen,
leftbrace,sourceElements,rightbrace.



sourceElements = (statement | functionDeclaration ) star.

program = sourceElements.
|
)

...


I really like the way you can separate the grammar from the AST creation (as described in the Executable Grammars[PDF] paper). As you can see there's no code specified for this purpose. Along with the source code there's is a 'testing AST' and a parser that inherits from this grammar which is used by the unit tests .

Almost all the grammar was written using the parser combinators from the library. Only charExceptFor: and accept: ifNotIn: were created.

There are still a lot work to do with this parser:

  1. Clean up the code
  2. Work on performance issues
  3. Find a solution for the "Automatic Semicolon Insertion" feature (see section 7.9) of the Ecma document)
  4. Get rid of some repetition (for example the 'NoVar' productions which are also present in the document)
  5. Better AST creation
  6. See if unicode support is possible
  7. More tests!


In order to see the result of using this parser I created a little program to display the 'testing AST'. For example:



The parser with tests and the other code mentioned can be found here .