EnFuzion Tutorial

A Random Walk Simulation

Here we present a simple random walk application that demonstrates the use of EnFuzion. The random walk is one of the basic methods used in financial and scientific calculations.

Consider the following question. We repeatedly throw a coin. If it comes up tails, we move one step to the left. If it comes up heads, we move one step to the right. What is the maximum distance from the starting point that we will reach?

We model the simulation with a program that takes two parameters:

  • number of steps in each walk

  • number of random walks

The program, called max_distance, takes these parameters from the command line, and performs a simulation. It executes the required number of walks, each with the required number of steps and calculates the maximum distance from the start that is reached by a step. At the end, max_distance prints the values of the input parameters and the maximum distance traveled to standard output. For example, the following command returns the maximum distance from the start for 10 random walks with 100 steps in each walk.


% max_distance 100 10
steps 100, iterations 10, maximum distance 21
%
  

Examples of max_distance written in shell, Perl and Windows 2000/XP batch are provided below. Although shell programs are much slower than programs in a compiled language and would not be used in practical applications, they serve as a good example for purposes of demonstration.

Here is an example of max_distance written in shell.

Example:

#!/bin/sh
#
#       simulate random walk in one dimension
#       usage: max_distance.sh <nr_of_steps> <nr_of_iterations>
#       print out maximum distance from the start
#

distance=0

#
#       each iteration is one random walk
#
iter=0
while [ $iter -lt $2 ]; do

        position=0

#
#       each iteration is one step in the walk
#
        step=0
        while [ $step -lt $1 ]; do

#
#               perform left or right move
#
                if [ $RANDOM -lt 16384 ]; then
                        position=$((position+1));
                else
                        position=$((position-1));
                fi
                step=$(($step+1))
                #echo $step $position

#
#               update distance from the start
#
                if [ $position -gt $distance ]; then 
                        distance=$position
                fi
                if [ $((-position)) -gt $distance ]; then 
                        distance=$((-position));
                fi
	
                #echo $distance
        done

        iter=$(($iter+1))
done

#
#       print out maximum distance
#
echo steps $1,  iterations $2,  maximum distance $distance

And here is an example of max_distance written in Perl.

Example:

#!/usr/bin/perl -w
#
#       simulate random walk in one dimension
#       usage: max_distance.pl <nr_of_steps> <nr_of_iterations>
#       print out maximum distance from the start
#

my $nsteps = shift @ARGV;
my $niter = shift @ARGV;

my $distance = 0;
srand();

#
#       each iteration is one random walk
#
my $iter = 0;
while ($iter < $niter) {
        my $position = 0;
#
#       each iteration is one step in the walk
#
        my $step = 0;
        while ($step < $nsteps) {
                #
                #       perform left or right move
                #
                if (rand(1.0) < 0.5) {
                        $position++;
                } else {
                        $position--;
                }
                $step++;

#
#               update distance from the start
#
                if (abs($position) > $distance) {
                        $distance = abs($position);
                }
        }
        $iter++;
}

print "steps: $nsteps, iterations: $niter, maximum distance: $distance\n";

Here is an example of a max_distance batch file written for Windows 2000/XP.

Example:

:
:       simulate random walk in one dimension
:       usage: max_distance <nr_of_steps> <nr_of_iterations>
:       print out maximum distance from the start
:
@echo off
set distance=0
:
:       each iteration is one random walk
:
FOR /L %%i IN (1,1,%2) do call :FIRSTLOOP %1
:
:       print out maximum distance
:
echo steps %1,  iterations %2,  maximum distance %distance%
goto :EOF

::::::::::::::::::::::::::::::
: Subroutines
::::::::::::::::::::::::::::::

:FIRSTLOOP
:
:       each iteration is one step in the walk
:
        set position=0
        FOR /L %%j IN (1,1,%1) do call :SECONDLOOP %%j
goto :EOF

:SECONDLOOP
:
:       perform left or right move
:
        set temp=%RANDOM%
        if %temp% LSS 16384 set /a position=position+1 
        if %temp% GEQ 16384 set /a position=position-1
:       echo %1 %position%
:
:       update distance from the start
:
        set temp=%position%
        if %temp% LSS 0 set /a temp=0-temp
        if %temp% GTR %distance% set /a distance=temp
:       echo %distance%
goto :EOF

Using this model of the random walk, we wish to explore the behavior of the system as the key parameters are varied. For our example, we select the following input values:

  • 10 values for the number of steps, ranging from 10 to 100

  • 10 values for the number of walks, ranging from 10 to 100

The results are collected in a file. The file can be later visualized with any available visualization tool. (Not discussed here.)

How would we simulate the problem without EnFuzion?

Our simple example requires 10 jobs. A common way of running the simulation is to set up a command file which runs the program across all of the parameter values. For example, the following commands run the program across three different parameter values and send the output to three different files:


max_distance 10 10 > output.1
max_distance 20 10 > output.2
max_distance 30 10 > output.3
  

On Unix, the output files are then concatenated using the following command:


% cat output.* >final_output
  

This approach can be time consuming and error prone for a large number of jobs, and it fails to harness the power of multiple computers. The following sections describe how to solve the problem with EnFuzion.

Overview of the simulation

The simulation requires repeated execution of the same program across a range of parameter values. To speed up the simulation, we use remote nodes to carry out the execution. Thus, the main simulation phases are:

  • Copying the max_distance program to all nodes

  • Executing the max_distance program across its parameter range on the nodes, possibly executing several instances concurrently

  • Storing output from each program instance in a local file on the node

  • Copying each output file from the node to the root host

  • Deleting all files on the nodes

  • Concatenating all output files

The main simulation phases and the description of the jobs are specified in a run file. Run files are constructed by users and used by EnFuzion to specify runs and perform the simulation. A run file for our example is described below.

Setting up a run file for Linux/Unix environment

A run file is a plain text file containing a description of a single run. Run files can be composed in a text editor or generated by a program.

A run consists of many jobs. All jobs in one run perform the same tasks, usually with different input parameters. A run file thus contains two major components, tasks with commands to execute and parameters for each job.

Task descriptions

Each run has its own tasks. Tasks are simply a list of commands that are executed by EnFuzion to run a job. Tasks are shared by all jobs in the run. These are the task descriptions required by our example:


# node initialization
task nodestart
        # copy executable from the root directory to remote node,
        copy -t max_distance* node:.

        # set execution permission on unix node
        if ($ENFOS != WindowsNT) then
          node:execute chmod 0755 max_distance
        endif
endtask
  

# individual jobs
task main
        # execute the simulation with corresponding parameter values for $steps and $iterations
        node:execute ./max_distance $steps $iterations >output
        # copy the remote output file to a unique file on the root host
        copy node:output output.$ENFJOBNAME
endtask
  

# run finalization
task rootfinish
        # concatenate output files
        if ($ENFOS == WindowsNT) then
          execute cmd /c copy output.* final_output
        else
          execute cat output.* >final_output
        endif

        # remove temporary files on the root host
        # files on nodes are removed automatically
        if ($ENFOS == WindowsNT) then
          execute cmd /c del output.*
        else
          execute rm -f output.*
        endif
endtask
  

Task descriptions start with the keyword 'task' followed by a task name, contain a list of commands and end with the keyword 'endtask'. EnFuzion provides simple commands to describe job execution. Two of the most common EnFuzion commands are copy, which copies files, and execute, which executes programs. Both commands can work on the root or on node hosts. By default, the location is the root host. If "node:" is placed before an EnFuzion command or a file name, the node host is used instead of the root host.

There are three tasks in our example: nodestart, main and rootfinish. They are executed during the node initialization, for each job, and at the end of the run, respectively.

Task nodestart copies the max_distance executable to the remote nodes and sets its execution permissions. It assumes that all nodes are on Linux or Unix platforms. It is performed only once for each node, during the node initialization phase, and not for each individual job.

Task main executes for each job. It runs the max_distance simulation. $steps and $iterations are replaced with specific parameter values for the job. They are supplied to max_distance through the command line. Output is stored in file output. At the end, file output is copied from a remote host to a unique file on the root host.

Task rootfinish executes after all the jobs have completed. It concatenates the results and deletes temporary files on the root host. All files on the nodes are removed automatically after the jobs complete.

Job parameters

Job parameters are specified with the keyword 'job', followed by the number of parameters and parameter values. The following are job descriptions for our example, which has 100 jobs with parameter steps ranging from 10 to 100 and parameter iterations ranging from 10 to 100:


job parameters 2 steps 10 iterations 10;
job parameters 2 steps 20 iterations 10;
...
job parameters 2 steps 100 iterations 10;
job parameters 2 steps 10 iterations 20;
job parameters 2 steps 20 iterations 20;
...
job parameters 2 steps 100 iterations 20;
...
job parameters 2 steps 10 iterations 100;
job parameters 2 steps 20 iterations 100;
...
job parameters 2 steps 100 iterations 100;
  
Run file example for Linux/Unix environment

The complete run file with task descriptions and job parameters is shown below.

Example:

# node initialization
task nodestart
        # copy executable from the root directory to remote node,
        copy -t max_distance* node:.

        # set execution permission on unix node
        if ($ENFOS != WindowsNT) then
          node:execute chmod 0755 max_distance
        endif
endtask

# individual jobs
task main
        # execute the simulation with corresponding parameter values for $steps and $iterations
        node:execute ./max_distance $steps $iterations >output
        # copy the remote output file to a unique file on the root host
        copy node:output output.$ENFJOBNAME
endtask

# run finalization
task rootfinish
        # concatenate output files
        if ($ENFOS == WindowsNT) then
          execute cmd /c copy output.* final_output
        else
          execute cat output.* >final_output
        endif

        # remove temporary files on the root host
        # files on nodes are removed automatically
        if ($ENFOS == WindowsNT) then
          execute cmd /c del output.*
        else
          execute rm -f output.*
        endif
endtask

job parameters 2 steps 10 iterations 10;
job parameters 2 steps 10 iterations 20;
job parameters 2 steps 10 iterations 30;
job parameters 2 steps 10 iterations 40;
job parameters 2 steps 10 iterations 50;
job parameters 2 steps 10 iterations 60;
job parameters 2 steps 10 iterations 70;
job parameters 2 steps 10 iterations 80;
job parameters 2 steps 10 iterations 90;
job parameters 2 steps 10 iterations 100;
job parameters 2 steps 20 iterations 10;
job parameters 2 steps 20 iterations 20;
job parameters 2 steps 20 iterations 30;
job parameters 2 steps 20 iterations 40;
job parameters 2 steps 20 iterations 50;
job parameters 2 steps 20 iterations 60;
job parameters 2 steps 20 iterations 70;
job parameters 2 steps 20 iterations 80;
job parameters 2 steps 20 iterations 90;
job parameters 2 steps 20 iterations 100;
job parameters 2 steps 30 iterations 10;
job parameters 2 steps 30 iterations 20;
job parameters 2 steps 30 iterations 30;
job parameters 2 steps 30 iterations 40;
job parameters 2 steps 30 iterations 50;
job parameters 2 steps 30 iterations 60;
job parameters 2 steps 30 iterations 70;
job parameters 2 steps 30 iterations 80;
job parameters 2 steps 30 iterations 90;
job parameters 2 steps 30 iterations 100;
job parameters 2 steps 40 iterations 10;
job parameters 2 steps 40 iterations 20;
job parameters 2 steps 40 iterations 30;
job parameters 2 steps 40 iterations 40;
job parameters 2 steps 40 iterations 50;
job parameters 2 steps 40 iterations 60;
job parameters 2 steps 40 iterations 70;
job parameters 2 steps 40 iterations 80;
job parameters 2 steps 40 iterations 90;
job parameters 2 steps 40 iterations 100;
job parameters 2 steps 50 iterations 10;
job parameters 2 steps 50 iterations 20;
job parameters 2 steps 50 iterations 30;
job parameters 2 steps 50 iterations 40;
job parameters 2 steps 50 iterations 50;
job parameters 2 steps 50 iterations 60;
job parameters 2 steps 50 iterations 70;
job parameters 2 steps 50 iterations 80;
job parameters 2 steps 50 iterations 90;
job parameters 2 steps 50 iterations 100;
job parameters 2 steps 60 iterations 10;
job parameters 2 steps 60 iterations 20;
job parameters 2 steps 60 iterations 30;
job parameters 2 steps 60 iterations 40;
job parameters 2 steps 60 iterations 50;
job parameters 2 steps 60 iterations 60;
job parameters 2 steps 60 iterations 70;
job parameters 2 steps 60 iterations 80;
job parameters 2 steps 60 iterations 90;
job parameters 2 steps 60 iterations 100;
job parameters 2 steps 70 iterations 10;
job parameters 2 steps 70 iterations 20;
job parameters 2 steps 70 iterations 30;
job parameters 2 steps 70 iterations 40;
job parameters 2 steps 70 iterations 50;
job parameters 2 steps 70 iterations 60;
job parameters 2 steps 70 iterations 70;
job parameters 2 steps 70 iterations 80;
job parameters 2 steps 70 iterations 90;
job parameters 2 steps 70 iterations 100;
job parameters 2 steps 80 iterations 10;
job parameters 2 steps 80 iterations 20;
job parameters 2 steps 80 iterations 30;
job parameters 2 steps 80 iterations 40;
job parameters 2 steps 80 iterations 50;
job parameters 2 steps 80 iterations 60;
job parameters 2 steps 80 iterations 70;
job parameters 2 steps 80 iterations 80;
job parameters 2 steps 80 iterations 90;
job parameters 2 steps 80 iterations 100;
job parameters 2 steps 90 iterations 10;
job parameters 2 steps 90 iterations 20;
job parameters 2 steps 90 iterations 30;
job parameters 2 steps 90 iterations 40;
job parameters 2 steps 90 iterations 50;
job parameters 2 steps 90 iterations 60;
job parameters 2 steps 90 iterations 70;
job parameters 2 steps 90 iterations 80;
job parameters 2 steps 90 iterations 90;
job parameters 2 steps 90 iterations 100;
job parameters 2 steps 100 iterations 10;
job parameters 2 steps 100 iterations 20;
job parameters 2 steps 100 iterations 30;
job parameters 2 steps 100 iterations 40;
job parameters 2 steps 100 iterations 50;
job parameters 2 steps 100 iterations 60;
job parameters 2 steps 100 iterations 70;
job parameters 2 steps 100 iterations 80;
job parameters 2 steps 100 iterations 90;
job parameters 2 steps 100 iterations 100;

 


©2002 Axceleon, Inc.