Friday, December 28, 2007

A quick look at SNOBOL

This entry is just a brief glimpse at the SNOBOL programming language. From its Wikipedia entry:


SNOBOL (String Oriented Symbolic Language) is a computer programming language developed between 1962 and 1967 at AT&T Bell Laboratories by David J. Farber, Ralph E. Griswold and Ivan P. Polonsky.


SNOBOL is a language for string manipulations. Also from its Wikipedia entry:


... SNOBOL was widely used in the 1970s and 1980s as a text manipulation language ... its popularity has faded as newer languages such as Awk and Perl have made string manipulation by means of regular expressions popular ...



This language caught my attention while listening to the OOPSLA podcast episode on the excellent 50-in-50 talk by Guy Steele and Richard Gabriel.

Given that this a programming language exploration blog, learning more about this language provide an excellent opportunity to know more about the first languages for text manipulation.

The best place to learn about the language is http://www.snobol4.org/ a lot of SNOBOL resources can be found there. One of the best resources is a link to the THE SNOBOL4 PROGRAMMING LANGUAGE (Green Book).

All the examples presented in this post were created using the Macro SNOBOL4 in C implementation.

A "Hello world" program in SNOBOL looks like this:


OUTPUT = 'Hello World'
END


As shown here, the assignment to the special OUTPUT variable outputs the value to the standard output.

The inverse is also true for the INPUT variable. For example the following program asks the name of the user.


OUTPUT = "Your name? "
NAME = INPUT
OUTPUT = "Hello " NAME
END


Flow of control is given by jumps to labels given the successful execution of a statement. For example:


ASK
OUTPUT = "Your name? "
NAME = INPUT :F(DONE)
OUTPUT = "Hello " NAME :(ASK)
DONE
OUTPUT = "Finished"
END


This example asks for a name until the input is closed, that is end of file or Ctrl+D (in Linux). The ASK,DONE and END elements are labels; all of them (except for END) are user specified names. The :F(DONE) modifier means jump to DONE if failed and the :(ASK) modifier means jump to ASK.

The most interesting thing about the language is the string pattern matching capabilities. Here's an small(and very incomplete) example that extracts the parts of a simplified URL string:


LETTER = "abcdefghijklmnopqrstuvwxyz"
LETTERORDOT = "." LETTER
LETTERORSLASH = "/" LETTER

LINE = INPUT
LINE SPAN(LETTER) . PROTO "://" SPAN(LETTERORDOT) . HOST "/" SPAN(LETTERORSLASH) . RES

OUTPUT = PROTO
OUTPUT = HOST
OUTPUT = RES
END



In line 6, the contents of the LINE variable is matched against a pattern. The pattern contains the following elements:


  1. The SPAN(LETTER) . PROTO "://" section says identify a sequence of letters followed by "://" and assign them to the variable called PROTO

  2. The SPAN(LETTERORDOT) . HOST "/" secotion says take a sequence of letters and dots followed by "/" and assign then to the variable called HOST

  3. Finally the last section takes the remaining letters and slash characters and assign them to the RES variable




To show a litte program that uses all the elements presented here, I wanted to create a small example that takes as input the authentication /var/log/auth.log and shows all the uses of sudo and the program that was executed. The desired lines look like this:


Dec 28 08:21:42 glorfindel sudo: lfallas : TTY=pts/3 ; PWD=/home/lfallas ; USER=root ; COMMAND=/bin/bash


This file also contains entries other than sudo usages, so we have to ignore them.

Heres the program:


&ANCHOR = 0
UCASE = "ABCDEFGHIJLKMNOPQRSTUVWXYZ"
LCASE = "abcdefghijlkmnopqrstuvwxyz"
DIGIT = "0123456789"
APATH = SPAN(DIGIT)
USERNAMECHAR = DIGIT LCASE UCASE
USERNAMEPAT = SPAN(USERNAMECHAR)

READLINE LINE = INPUT :F(DONE)
LINE " sudo: " USERNAMEPAT . USER :F(READLINE)
LINE "COMMAND=" ARB . COMMAND RPOS(0)

OUTPUT = USER ":" COMMAND :(READLINE)

DONE

END


Here the &ANCHOR assignment tells SNOBOL that pattern matching is performed at any position of the specified string. The ARB element says any character before the next pattern succeeds and the RPOS(0) element is used to identify the end of line.



For future entries I'm going to show more interesting SNOBOL features.

Thursday, December 20, 2007

Combining iterators in Java

In this post I'm going to show a little experiment for combining iterators in Java.

A couple of week a ago I read an excellent blog entry by Debasish Ghosh called
Infinite Streams using Java Closures. In it an implementation of the Infinite Stream concept is presented with the help of Java closures.

For this post I wanted to show something related to this but using the Iterable/Iterator interfaces instead of a custom stream object. What interest me the most is the operations applied to stream objects such as map or filter but on iterators. These and other operations are defined for the IEnumerable/IEnumerator interfaces in C# with the extension methods defined in System.Linq.Enumerable.

Also, given the amount of discussion lately around Java closures I wanted to create this code using the Java closures prototype of the BGGA closures proposal to learn about it.

What I want to have is the following operations on iterators:


import java.util.*;

public class IteratorUtils {
public static <X,V> Iterable<V> map({X=>V} f,Iterable<X> i) {
return new EachElementIterable<X,V>(i,f);
}
public static <X> Iterable<X> take(int count,Iterable<X> i) {
return new TakeIterable<X>(i,count);
}
public static <X> Iterable<X> filter({X=>boolean} pred,Iterable<X> i) {
return new FilterIterable<X>(i,pred);
}
public static <X> Iterable<X> flatten(Iterable<Iterable<X>> multiIterator) {
return new FlattenIterable<X>(multiIterator);
}
}


The map operation takes a function and a iterator and generate another iterator with the function applied to each element. The take operation generates an iterator for a number of elements on another iterator. The filter operation takes a predicate and a iterator and returns another iterator with only the elements that applied to the predicate. The flatten operation takes an iterator of iterators and returns a flattened sequence of values.

Heres the iterator for the map operation:


import java.util.*;
public class EachElementIterable<T,J> implements Iterable<J>{
private Iterable<T> iterable;
private {T=>J} function;
public EachElementIterable(Iterable<T> iterable,{T=>J} function) {
this.iterable = iterable;
this.function = function;
}
public Iterator<J> iterator() {
return new EachElementIterator<T,J>(iterable.iterator(),function);
}

static class EachElementIterator<T,J> implements Iterator<J>{
private Iterator<T> iterator;
private {T=>J} function;
public EachElementIterator(Iterator<T> iterator,{T=>J} function) {
this.function = function;
this.iterator = iterator;
}
public J next() {
return function.invoke(iterator.next());
}
public boolean hasNext() {
return iterator.hasNext();
}
public void remove() {
}
}
}


The implementation is not as elegant as the one presented in the Infinite Stream entry, but it can be used with anything implementing Iterable.

Heres the iterator for the take operation:


import java.util.*;

public class TakeIterable<T> implements Iterable<T> {
private int count;
private Iterable<T> iterable;
public TakeIterable(Iterable<T> iterable,int count) {
this.count = count;
this.iterable = iterable;
}
public Iterator<T> iterator() {
return new TakeIterator<T>(iterable.iterator(),count);
}

static class TakeIterator<T> implements Iterator<T> {
private Iterator<T> iterator;
private int count;
private int current;
public TakeIterator(Iterator<T> iterator,int count) {
this.count = count;
this.iterator = iterator;
this.current = 0;
}
public T next() {
if (current++ < count) {
return iterator.next();
} else{
return null;
}
}
public boolean hasNext() {
return iterator.hasNext() && current < count;
}
public void remove() {
}
}

}


Here's the implementation of the filter iterator:


import java.util.*;

public class FilterIterable<T> implements Iterable<T> {
private Iterable<T> iterable;
private {T=>boolean} predicate;
public FilterIterable(Iterable<T> iterable,{T=>boolean} predicate) {
this.iterable = iterable;
this.predicate = predicate;
}
public Iterator<T> iterator() {
return new FilterIterator<T>(iterable.iterator(),predicate);
}

static class FilterIterator<T> implements Iterator<T> {
private Iterator<T> iterator;
private {T=>boolean} predicate;
private T currentValue;
boolean finished = false;
boolean nextConsumed = true;

public FilterIterator(Iterator<T> iterator,{T=>boolean} predicate) {
this.predicate = predicate;
this.iterator = iterator;
}

public boolean moveToNextValid() {
boolean found = false;
while(!found && iterator.hasNext()) {
T currentValue = iterator.next();
if(predicate.invoke(currentValue)) {
found = true;
this.currentValue = currentValue;
nextConsumed = false;
}
}
if(!found) {
finished = true;
}
return found;
}

public T next() {
if (!nextConsumed) {
nextConsumed = true;
return currentValue;
} else {
if (!finished) {
if(moveToNextValid()) {
nextConsumed = true;
return currentValue;
}
}
}
return null;
}
public boolean hasNext() {
return !finished &&
(!nextConsumed || moveToNextValid());
}
public void remove() {
}
}
}


And finally, here's the flatten iterator:


public class FlattenIterable<T> implements Iterable<T> {
private Iterable<Iterable<T>> iterable;
public FlattenIterable(Iterable<Iterable<T>> iterable) {
this.iterable = iterable;
}
public Iterator<T> iterator() {
return new FlattenIterator<T>(iterable.iterator());
}
static class FlattenIterator<T> implements Iterator<T> {
private Iterator<Iterable<T>> iterator;
private Iterator<T> currentIterator;
public FlattenIterator(Iterator<Iterable<T>> iterator) {
this.iterator = iterator;
currentIterator = null;
}
public boolean hasNext() {
boolean hasNext = true;
if (currentIterator == null) {
if (iterator.hasNext()) {
currentIterator = iterator.next().iterator();
} else {
return false;
}
}

while(!currentIterator.hasNext() &&
iterator.hasNext()) {
currentIterator = iterator.next().iterator();
}

return currentIterator.hasNext();
}

public T next() {
return currentIterator.next();
}

public void remove() {
}
}
}


We can combine these wrapper iterators to create operators on iterators of any kind for example:


ArrayList<String> anArray = new ArrayList<String>();
anArray.add("foo");
anArray.add("goo");
anArray.add("zoo");
...

for(String s : map({String s => "<li>"+s+"</li>"},
filter({String s => !s.startsWith("g")},
anArray))) {
System.out.println(s);
}


This example prints the "foo" and "zoo" between <li> and </li> tags. Also here I'm using Java static import to avoid writing IteratorUtils in every operation call.

Returning to the infinite stream operations, for me the most interesting thing about these wrapper iterators is that an iterator with a large amout of elements could be manipulated without consuming all of its elements in each step. For example take the following iterator:


public class NumericSequence implements Iterable<Integer>,Iterator<Integer>{
private int value = 0;
public NumericSequence() {
}
public NumericSequence(int start) {
value = start;
}
public Iterator<Integer> iterator() {
return this;
}
public Integer next() {
return value++;
}
public boolean hasNext() {
return true;
}
public void remove() {
}
}


We can manipulate this iterator using the operations and wrappers defined above:


for(int i : take(5,
filter(
{Integer x => x % 2 != 0 },
map( {Integer x=>x*x},
new NumericSequence())))) {
System.out.println(i);
}


This program prints only 1,9,25,49 and 81.

For the final example I wanted to create a little snippet that prints a list of friendly pairs of numbers. A Friendly Pair is a pair of numbers that share the common characteristic of having the same value as a result of the sum of all of its divisors divided by itself. Another nice definition of a Friendly number is presented here.

First we define some utilities:


public class IntegerIteratorUtils {
public static int sum(Iterable<Integer> integerIterable) {
int result = 0;
for(Integer i : integerIterable) {
result += i.intValue();
}
return result;
}
public static Iterable<Integer> divisors(int number) {
return
IteratorUtils.filter(
{Integer x => x != 0 && number % x == 0 },
IteratorUtils.take(number+1,new NumericSequence()));
}

public static int divisorFunction(int number) {
return sum(divisors(number));
}
}



Having these functions we can now calculate the pairs:


int max = 1000;
for(Pair<Integer,Integer> p :
filter( {Pair<Integer,Integer> p =>
divisorFunction(p.first)/(double)p.first ==
divisorFunction(p.second)/(double)p.second },
flatten(
map(
{Integer x =>
map({Integer ix => new Pair<Integer,Integer>(x,ix)},
take(max,new NumericSequence(x+1)))},
take(max,
new NumericSequence(1)))))) {

System.out.println("("+p.first.toString()+", "+p.second.toString()+")");
}


This program prints:


(6, 28)
(6, 496)
(12, 234)
(28, 496)
(30, 140)
(40, 224)
(66, 308)
(78, 364)
(80, 200)
(84, 270)
....


Code for this example was created with the closures prototype from 2007-11-30. Source can be found here.

Tuesday, December 11, 2007

Creating fractal images with C# 3.0 features and Parallel Extensions

In this post I'm going to show a small example of using the Parallel Extensions Library to improve a program that renders a escape time fractal image.

A couple of months ago I wrote about a small program that creates an image with the escape time algorithm using C# 3.0 features . For this post I'm going to change the implementation to use the Parallel Extensions Library.

Along the the library documentation and MSDN articles, the blog entries from the Parallel Extensions Team blog provided a nice introduction and guidance . The original post was inspired by Luke Hoban's blog entry A Ray Tracer in C#3.0 .

The escape time algorithm is very simple and also considered a embarrassingly parallelproblem. For previous posts I did a similar experiment of using Fortress parallel for loops to render the Mandelbrot Set fractal.

The code for the original program was the following:


void CreateFractal(double minX, double minY,double maxX, double maxY,int imageWidth, int imageHeight)
{
Func<double, double> xF = MathUtils.InterpFunc(0, minX, imageWidth, maxX);
Func<double, double> yF = MathUtils.InterpFunc(0, minY, imageHeight, maxY);

foreach (var p in from yi in Enumerable.Range(0, imageHeight)
from xi in Enumerable.Range(0, imageWidth)
select new
{
x = xi,
y = yi,
xD = xF(xi),
yD = yF(yi)
})
{

Complex p0 = new Complex(p.xD, p.yD);
Func<Complex, Complex> function = functionConstructor(p0);

int i = ApplyFunction(function, p0)
.TakeWhile(
(x, j) => j < maxIteration && x.NormSquared() < 4.0)
.Count();

HandlePixel(p.x, p.y, i);
}

}



We can change this code in several ways to take advantage of the library. Here I'm going to present two of them.

Write one big LINQ expression

The code for the generation of pixel data is coded in a single LINQ expression.

Func<double, double> xF = MathUtils.InterpFunc(0, minX, imageWidth, maxX);
Func<double, double> yF = MathUtils.InterpFunc(0, minY, imageHeight, maxY);

foreach (var p in from yi in Enumerable.Range(0, imageHeight).AsParallel()
from xi in Enumerable.Range(0, imageWidth)
let mappedX = xF(xi)
let mappedY = yF(yi)
let p0 = new Complex(xF(xi), yF(yi))
let function = functionConstructor(p0)
select new
{
x = xi,
y = yi,
xD = mappedX,
yD = mappedY,
i = ApplyFunction(function, p0)
.TakeWhile(
(x, j) => j < maxIteration && x.NormSquared() < 4.0)
.Count()
})
{
HandlePixel(p.x, p.y, p.i);
}


Elements from the body of the foreach loop were moved to the LINQ expression by using the let keyword.

The AsParallel extension method called does the magic of distributing the work for calculating the lines of the image. A nice discussion on were to put this call can be found in Parallelizing a query with multiple “from” clauses and Chunk partitioning vs range partitioning in PLINQ.

Although this problem is much more simpler than Ray tracing, this alternative tries to follow the same approach as: Taking LINQ to Objects to Extremes: A fully LINQified RayTracer.



Add a Parallel.For loop

The other alternative is to replace one of the from elements with a Parallel.For loop.


Func<double, double> xF = MathUtils.InterpFunc(0, minX, imageWidth, maxX);
Func<double, double> yF = MathUtils.InterpFunc(0, minY, imageHeight, maxY);

Parallel.For(0, imageHeight , delegate(int yi)
{
foreach (var p in from xi in Enumerable.Range(0, imageWidth)
let mappedX = xF(xi)
let mappedY = yF(yi)
let p0 = new Complex(xF(xi), yF(yi))
let function = functionConstructor(p0)
select new
{
x = xi,
y = yi,
xD = mappedX,
yD = mappedY,
i = ApplyFunction(function, p0)
.TakeWhile(
(x, j) => j < maxIteration && x.NormSquared() < 4.0)
.Count()
})
{
lock (this)
{
HandlePixel(p.x, p.y, p.i);
}
}
});



A lock needed to be added because HandlePixel calls Bitmap.SetPixel which needs to be protected.

Results

I tested this code by generating a 2000 by 2000 image of a small section of the Mandelbrot fractal located between (-1.1752491998171,0.223337905807042) and (-1.17342021033379,0.225166895290352) with a escape time of 512. A reduced image of this location looks like this:

Calculated Fractal Image

By using the library the program took about 40% less of the time of original version on my dual core machine. The performance for the two approaches presented above was very similar.

The nicest thing about this experiment is that the code required just a couple of modifications in order to take advantage of the other CPU.

Another approach for rendering this fractal using Parallel extensions is from Jon Skeet's Coding Blog : LINQ to Silliness: Generating a Mandelbrot with parallel potential and A cautionary parallel tale: ordering isn't simple.

Code for this post can be found here.

Tuesday, December 4, 2007

Improving code using more Scala features

A couple of days ago Eric Willigers gave me some nice feedback on how to improve the code of the quick fix created for the previous post.

Before modifying the document we are required to lock it and then release it. I did it like this:


...
doc.atomicLock
doc.replace(ifRange.getStart,ifRange.getLength,resultingStatement,null)
doc.atomicUnlock
...


However this is not safe since the replace method could throw a BadLocationException. One of the things we could do is to add a try/catch/finally block. However Eric pointed me to a much nicer way to do , by using two techniques called:



The Loan pattern is a way to ensure resource disposal when the control leaves some scope. For this propose closures are used.

The Pimp my library pattern, based on a Martin Odersky’s article, provides a way to extend existing class libraries with new operations without recompiling. New operations are added by defining a wrapper for specific classes. In this case we need to apply the Loan pattern to the BaseDocument class


class RichDocument(doc : BaseDocument) {
def withLock[T](f : BaseDocument => T ) = {
try {
doc.atomicLock()

f(doc)

} catch {
case ble : BadLocationException =>
Exceptions.printStackTrace(ble)
} finally {
doc.atomicUnlock()
}
}
}


This class defines our wrapper, the f is an anonymous function with code that will modify the document. Now we can add this functionality to the BaseDocument class by using the following declaration:


object DocumentImplicits {
implicit def toRichDocument(doc : BaseDocument) = new RichDocument(doc)
}



Putting using these definitions in the fix code leaves the code like this:


...
doc.withLock(_.replace(ifRange.getStart,
ifRange.getLength,
resultingStatement,
null))
...


The "_.replace ..." syntax is interesting way of specifying an anonymous function with one argument. For more information on this see Placeholder Syntax for Anonymous Functions in section 6.23 of the Scala Language Specification.


Another thing that could be done using this pattern is to eliminate direct uses of the AstUtilities class. By doing this we can add the following definitions:


object AstNodeImplicits {
implicit def toRichAstNode(node : Node) = new RichAstNode(node)
}

class RichAstNode(node : Node) {
def getRange() = AstUtilities.getRange(node)
}


Before using this implicits we need to import them:


import org.langexplr.nbhints.utils.DocumentImplicits.toRichDocument
import org.langexplr.nbhints.utils.AstNodeImplicits.toRichAstNode


The final code for the fix method is the following:


def implement() = {
val doc = info.getDocument.asInstanceOf[BaseDocument]
val conditionRange = condition.getRange()
val conditionText = getConditionText(doc,conditionRange)
val statementRange = statement.getRange()
val statementText = doc.getText(statementRange.getStart,
statementRange.getLength)
val ifRange = ifNode.getRange()

val resultingStatement = statementText + getModifierText + conditionText

doc.withLock(_.replace(ifRange.getStart,
ifRange.getLength,
resultingStatement,
null))
}


Code for this example can be found here (Now updated to the released Netbeans 6.0 ) .