Tuesday, March 3, 2009

Manipulating AVM2 byte code with F#

In this post I'm going to show an example of using AbcExplorationLib to manipulate simple AVM2 byte code (ActionScript). The example shows how to load an .abc file, modify it, and write it back to disk.

AbcExplorationLib is a library for manipulating AVM2 byte code (described here). Although it is still incomplete, some basic examples already work, such as the ones presented in this post.

The following ActionScript code will be compiled to byte code.


var i = 0;
for (i = 0; i < 10; i++) {
    print("inside loop");
}
print("Done");


To generate the ".abc" file we type:

c:\test\> java -jar c:\flexsdk\lib\asc.jar test.as



Loading the compiled file



We're going to use the F# REPL (fsi.exe) to manipulate the file. We start by referencing the library.


> #r "abcexplorationlib.dll";;

--> Referenced 'C:\test\abcexplorationlib.dll'

> open Langexplr.Abc;;

Now we load the file:


> let abcFile = using (new System.IO.FileStream("test.abc",System.IO.FileMode.Open)) (
fun s -> AvmAbcFile.Create(s));;


Now abcFile contains the code of the compiled program.


> abcFile;;
val it : AvmAbcFile
= Langexplr.Abc.AvmAbcFile {Classes = [];
Scripts = [Langexplr.Abc.AvmScript];}
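As a quick sanity check before handing a file to the library, the first bytes of an .abc file can be read directly. The following is a minimal sketch that does not use AbcExplorationLib: readAbcVersion is a hypothetical helper name, and it relies only on the fact that an .abc file starts with two little-endian 16-bit values, minor_version followed by major_version. It is written for current F# and uses use bindings instead of the using function.

```fsharp
open System.IO

// Hypothetical helper (not part of AbcExplorationLib): reads the version
// header found at the start of an .abc file.
let readAbcVersion (path: string) =
    use stream = File.OpenRead path          // closed automatically on exit
    use reader = new BinaryReader(stream)    // BinaryReader reads little-endian values
    let minor = reader.ReadUInt16()          // minor_version comes first
    let major = reader.ReadUInt16()          // major_version comes second
    (major, minor)
```

Calling readAbcVersion "test.abc" returns a (major, minor) pair; the use bindings close the reader and the stream even if reading fails.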




Inspecting the instructions



We're interested in the instructions of the top-level script for this .abc file. By typing the following expression we can get to this section:


> abcFile.Scripts.[0].InitMethod.Body.Value.Instructions;;
val it : AbcFileInstruction array
= [|GetLocal0; PushScope; PushByte 0uy; GetGlobalScope; Swap; SetSlot 1;
PushByte 0uy; GetGlobalScope; Swap; SetSlot 1;
Jump (SolvedReference "dest39"); ArtificialCodeBranchLabel "dest18"; Label
FindPropertyStrict
(MQualifiedName
([|Ns ("",CONSTANT_Namespace); Ns ("test.as$0",CONSTANT_PrivateNs)|],
"print")); PushString "inside loop";
CallProperty
(MQualifiedName
([|Ns ("",CONSTANT_Namespace); Ns ("test.as$0",CONSTANT_PrivateNs)|],
"print"),1); Pop; GetGlobalScope; GetSlot 1; Increment; SetLocal_2;
GetLocal2; GetGlobalScope; Swap; SetSlot 1; Kill 2;
ArtificialCodeBranchLabel "dest39"; GetGlobalScope; GetSlot 1;
PushByte 10uy; IfLt (SolvedReference "dest18");
FindPropertyStrict
(MQualifiedName
([|Ns ("",CONSTANT_Namespace); Ns ("test.as$0",CONSTANT_PrivateNs)|],
"print")); PushString "Done";
CallProperty
(MQualifiedName
([|Ns ("",CONSTANT_Namespace); Ns ("test.as$0",CONSTANT_PrivateNs)|],
"print"),1); CoerceA; SetLocal_1; GetLocal1; ReturnValue; Kill 1|]



We're going to define the following function to assist in the presentation of instruction listings.


> open Langexplr.Abc.InstructionPatterns;;
> let pr (i:AbcFileInstruction) =
-     match i with
-     | ArtificialCodeBranchLabel t -> printf "%s:\n" <| t.ToString()
-     | i & UnsolvedSingleBranchInstruction(d,_) -> printf " %s %d\n" i.Name d
-     | i & SolvedSingleBranchInstruction(l,_) -> printf " %s %s\n" i.Name l
-     | _ -> printf " %s\n" i.Name;;

val pr : AbcFileInstruction -> unit


Now we can type:


> abcFile.Scripts.[0].InitMethod.Body.Value.Instructions |> Array.iter pr;;
getlocal_0
pushscope
pushbyte
getglobalscope
swap
setslot
pushbyte
getglobalscope
swap
setslot
jump dest39
dest18:
label
findpropertystrict
pushstring
callprop
pop
getglobalscope
getslot
increment
setlocal_2
getlocal_2
getglobalscope
swap
setslot
kill
dest39:
getglobalscope
getslot
pushbyte
iflt dest18
findpropertystrict
pushstring
callprop
coerce_a
setlocal_1
getlocal_1
returnvalue
kill
val it : unit = ()
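The pr function above relies on active patterns to treat all single-target branch instructions alike. The following is a minimal sketch of how such patterns might be defined. It uses a toy instruction type, not AbcExplorationLib's real types; Target, Instr, SolvedBranch, UnsolvedBranch, nameOf and show are all names invented for this illustration.

```fsharp
// Toy types for illustration only (not the library's definitions).
type Target =
    | Unsolved of int      // relative byte offset
    | Solved of string     // symbolic label name

type Instr =
    | Jump of Target
    | IfLt of Target
    | Pop
    | CodeLabel of string

// Partial active pattern: matches any branch whose target is a label name.
let (|SolvedBranch|_|) instr =
    match instr with
    | Jump (Solved l) | IfLt (Solved l) -> Some l
    | _ -> None

// Partial active pattern: matches any branch whose target is a byte offset.
let (|UnsolvedBranch|_|) instr =
    match instr with
    | Jump (Unsolved d) | IfLt (Unsolved d) -> Some d
    | _ -> None

let nameOf instr =
    match instr with
    | Jump _ -> "jump"
    | IfLt _ -> "iflt"
    | Pop -> "pop"
    | CodeLabel _ -> "label"

// Formats one instruction as a line of text, in the spirit of pr.
let show instr =
    match instr with
    | CodeLabel l -> sprintf "%s:" l
    | SolvedBranch l -> sprintf " %s %s" (nameOf instr) l
    | UnsolvedBranch d -> sprintf " %s %d" (nameOf instr) d
    | _ -> sprintf " %s" (nameOf instr)
```

With these patterns, adding another branch instruction only requires extending the two active patterns and nameOf; show stays unchanged.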




A note on branch instructions



To make the code easier to manipulate and analyze, AbcExplorationLib adds a pseudo-instruction called ArtificialCodeBranchLabel, which does not exist in AVM2, to mark each position that a branch instruction can jump to. When these labels are generated, the branch instructions are modified to refer to the label's name instead of a relative byte offset. This process is briefly described in "Using F# Active Patterns to encapsulate complex conditions".

Converting from label references to byte offsets is also necessary to write code back to an .abc file. This process is performed by a function called ConvertSymbolicLabelsToByteReferences, for example:


> let c = AbcFileCreator();;

val c : AbcFileCreator

> abcFile.Scripts.[0].InitMethod.Body.Value.Instructions |>
- InstructionManipulation.ConvertSymbolicLabelsToByteReferences c |>
- Array.iter pr;;
getlocal_0
pushscope
pushbyte
getglobalscope
swap
setslot
pushbyte
getglobalscope
swap
setslot
jump 21
dest18:
label
findpropertystrict
pushstring
callprop
pop
getglobalscope
getslot
increment
setlocal_2
getlocal_2
getglobalscope
swap
setslot
kill
dest39:
getglobalscope
getslot
pushbyte
iflt -30
findpropertystrict
pushstring
callprop
coerce_a
setlocal_1
getlocal_1
returnvalue
kill
val it : unit = ()
>
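The offsets in the listing above (21 and -30) are consistent with measuring each branch from the end of the branch instruction to the position of the target label. The following is a minimal sketch of that calculation on a toy instruction type. It is not the implementation of ConvertSymbolicLabelsToByteReferences; Instr, branchSize, sizeOf and resolveLabels are names invented here, and the sample assumes every branch occupies 4 bytes (one opcode byte plus a 3-byte signed offset) and that the loop body between the two labels occupies 21 bytes.

```fsharp
// Toy instruction type for illustration only (not the library's types).
type Instr =
    | Plain of name: string * size: int           // non-branch instruction with its encoded size
    | LabelMark of string                         // symbolic label, occupies no bytes
    | BranchTo of name: string * target: string   // branch to a symbolic label
    | BranchBy of name: string * offset: int      // branch by a relative byte offset

let branchSize = 4 // assumed: 1 opcode byte + 3-byte signed offset

let sizeOf instr =
    match instr with
    | Plain (_, size) -> size
    | LabelMark _ -> 0
    | BranchTo _ | BranchBy _ -> branchSize

let resolveLabels (code: Instr list) : Instr list =
    // Pass 1: compute the byte address at which each instruction starts.
    let starts =
        code
        |> List.scan (fun addr i -> addr + sizeOf i) 0
        |> List.take (List.length code)
    let placed = List.zip starts code
    let labelAddr =
        placed
        |> List.choose (fun (addr, i) ->
            match i with
            | LabelMark l -> Some (l, addr)
            | _ -> None)
        |> Map.ofList
    // Pass 2: replace each label reference with an offset measured from the
    // end of the branch instruction. An unknown label raises KeyNotFoundException.
    placed
    |> List.map (fun (addr, i) ->
        match i with
        | BranchTo (name, target) -> BranchBy (name, labelAddr.[target] - (addr + branchSize))
        | other -> other)

let sample =
    [ Plain ("pushbyte", 2)
      BranchTo ("jump", "dest39")
      LabelMark "dest18"
      Plain ("label", 1)
      Plain ("loop body", 20)
      LabelMark "dest39"
      Plain ("getglobalscope", 1)
      Plain ("getslot", 2)
      Plain ("pushbyte", 2)
      BranchTo ("iflt", "dest18") ]

let resolved = resolveLabels sample
```

In resolved, the jump becomes BranchBy ("jump", 21) and the iflt becomes BranchBy ("iflt", -30). Because the offsets are recomputed from instruction sizes each time, inserting instructions between the labels only requires running the conversion again.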



Modifying the code



Branch targets are adjusted automatically when new code is added. For example, let's add some code that prints "Hola!" inside the loop.


> let printName = CQualifiedName(Ns("",NamespaceKind.CONSTANT_Namespace),"print"
- ) ;;

val printName : QualifiedName

> let printCode = [| FindPropertyStrict printName ;
-                    PushString "Hola!" ;
-                    CallProperty(printName,1);
-                    Pop |] ;;

val printCode : AbcFileInstruction array

> let instr = abcFile.Scripts.[0].InitMethod.Body.Value.Instructions;;

> Seq.append instr.[0..16] <| Seq.append printCode instr.[17..] |>
- Seq.to_array |>
- InstructionManipulation.ConvertSymbolicLabelsToByteReferences c |>
- Array.iter pr;;
getlocal_0
pushscope
pushbyte
getglobalscope
swap
setslot
pushbyte
getglobalscope
swap
setslot
jump 29
dest18:
label
findpropertystrict
pushstring
callprop
pop
findpropertystrict
pushstring
callprop
pop

getglobalscope
getslot
increment
setlocal_2
getlocal_2
getglobalscope
swap
setslot
kill
dest39:
getglobalscope
getslot
pushbyte
iflt -38
findpropertystrict
pushstring
callprop
coerce_a
setlocal_1
getlocal_1
returnvalue
kill
val it : unit = ()
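The insertion above combines two array slices with Seq.append and Seq.to_array. In current F# the same splice can be written with the Array module. The following is a minimal sketch on plain strings rather than real instructions; spliceAt is a hypothetical helper name, not part of AbcExplorationLib.

```fsharp
// Inserts `extra` into `source` so that its first element lands at `index`.
// Array.take and Array.skip raise an exception if `index` is out of range.
let spliceAt (index: int) (extra: 'a[]) (source: 'a[]) : 'a[] =
    Array.concat [ Array.take index source; extra; Array.skip index source ]

// Stand-ins for the instructions around the end of the first print call.
let original = [| "findpropstrict"; "pushstring"; "callprop"; "pop"; "getglobalscope" |]
let extra = [| "findpropstrict"; "pushstring"; "callprop"; "pop" |]

// Insert the new call right after the existing "pop" (index 4).
let patched = spliceAt 4 extra original
```

For the session above, the equivalent call would be spliceAt 17 printCode instr, since instr.[0..16] keeps the first 17 instructions.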



Writing the new file



We can write this code back to a .abc file by doing this:


> let newCode = Seq.append instr.[0..16] <| Seq.append printCode instr.[17..] |>
- Seq.to_array;;

val newCode : AbcFileInstruction array

> let oldbody = abcFile.Scripts.[0].InitMethod.Body.Value;;

> let newBody = AvmMethodBody(oldbody.Method,
-                             oldbody.MaxStack,
-                             oldbody.LocalCount,
-                             oldbody.InitScopeDepth,
-                             oldbody.MaxScopeDepth,
-                             newCode,
-                             oldbody.Exceptions,
-                             oldbody.Traits);;

val newBody : AvmMethodBody

> let newFile = AvmAbcFile( [AvmScript( abcFile.Scripts.[0].InitMethod.CloneWithBody(newBody), abcFile.Scripts.[0].Members)], []);;

val newFile : AvmAbcFile
> open System.IO;;
> let c = AbcFileCreator();;

val c : AbcFileCreator

>
- using (new BinaryWriter(new FileStream("test_modified.abc",FileMode.Create)))
-
- (fun f -> let file = newFile.ToLowerIr(c) in file.WriteTo(f));;
val it : unit = ()


Running this program using Tamarin shows:


c:\test\>avmplus_sd.exe test_modified.abc
inside loop
Hola!
inside loop
Hola!
inside loop
Hola!
inside loop
Hola!
inside loop
Hola!
inside loop
Hola!
inside loop
Hola!
inside loop
Hola!
inside loop
Hola!
inside loop
Hola!
Done



The AbcExplorationLib library is still quite incomplete, and there is a lot to improve, for example name handling and instruction modification. Future posts will present new features and experiments.