MONTE CARLO SIMULATION USING MATLAB
Introduction
Monte Carlo simulation is used to investigate the finite-sample performance of estimators and test statistics. The computation repeats a statistical procedure on randomly generated data under a variety of parameter values, and the individual simulation tasks are mutually independent. The HPC cluster can therefore raise the productivity of a Monte Carlo study by running many independent jobs simultaneously.
This article explains how to conduct Monte Carlo simulation on the HPC cluster using Matlab. More precisely, we show how to compile Matlab files to avoid the license constraint and then how to submit jobs to the cluster. We treat each independent job as a serial job on one node of the cluster and do not consider parallel programming. Please refer to the article “Running MATLAB Program in Parallel” if you are interested in parallel programming with Matlab.
Matlab files
Monte Carlo simulation often requires running independent jobs under a variety of parameter values. To this end, we compile the Matlab programs once and pass parameter values to the compiled file at run time. The advantages of this approach are that the Matlab files need not be recompiled for every parameter value and that independent jobs can run simultaneously from a single compiled file.
For the compiled Matlab files to accept outside values, every file must be written as a function, including the main file. Suppose that the main file, main.m, depends on two functions, fun1.m and fun2.m. After compilation we pass values only to main.m, while fun1.m and fun2.m take values assigned within main.m.
One important step in writing the Matlab functions is to convert string inputs to numbers. Since the compiled file receives its inputs only as strings, this conversion must be included in the file that takes outside values, here main.m. To illustrate, if main.m takes two inputs, x1 and x2, we write
    function main(s_x1, s_x2)
    x1 = str2double(s_x1);
    x2 = str2double(s_x2);
    . . .
In addition, to save simulation output held in an array called “Rmat”, for instance, we include the following lines at the end of main.m:
    Rname = sprintf('Result_%d_%d.txt', x1, x2);
    save(Rname, 'Rmat', '-ASCII');
where the inputs x1 and x2 are assumed to be integers. Note that fun1.m and fun2.m need no special treatment as long as their inputs are assigned within main.m.
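Because each job writes its output to a file named after its inputs, a small shell helper can later report which parameter combinations have not yet produced a result. The following is a minimal sketch, not part of the original workflow. It assumes file names of the form Result_<x1>_<x2>.txt with no embedded spaces and integer inputs running from 1 to n. The function name missing_results and its arguments are hypothetical.

```sh
#!/bin/sh
# List the parameter pairs whose result file is missing.
# Assumptions (not from the original article): files are named
# Result_<x1>_<x2>.txt and both inputs run over 1..n.
#   $1 = n, the largest value of each input
#   $2 = directory holding the result files (default: current directory)
missing_results() {
    n=$1
    dir=${2:-.}
    x1=1
    while [ "$x1" -le "$n" ]; do
        x2=1
        while [ "$x2" -le "$n" ]; do
            f="$dir/Result_${x1}_${x2}.txt"
            # Print the pair if its output file does not exist
            [ -f "$f" ] || echo "$x1 $x2"
            x2=$((x2 + 1))
        done
        x1=$((x1 + 1))
    done
}
```

Calling missing_results 4 in the output directory prints one line per pair that still lacks a result file, and prints nothing once all jobs have finished.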
Compile
To compile Matlab files together, we can write a two-line shell script:
    #!/bin/sh
    mcc-matlab2012a -m -i main.m fun1.m fun2.m
Here, we assume that the shell script and the Matlab files are in the same directory, although this can be changed with a minor modification. Running the script gives you the compiled file, run_main.sh, which the job submission step below invokes.
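Before submitting any jobs, it can be worth confirming that the compile step actually left its outputs in place. The sketch below rests on an assumption the article does not state: that compiling main.m yields an executable named main next to the wrapper script run_main.sh, as is usual for the Matlab compiler on Linux. The function name check_build is hypothetical.

```sh
#!/bin/sh
# Check that the compile step left its outputs in a directory.
# Assumption: the build produces an executable "main" alongside the
# wrapper script "run_main.sh"; adjust the list if your site differs.
#   $1 = build directory (default: current directory)
check_build() {
    dir=${1:-.}
    status=0
    for f in main run_main.sh; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            status=1
        fi
    done
    # Return 0 only if every expected file was found
    return $status
}
```

A call such as check_build . || exit 1 at the top of a submission script stops the run early when the build is incomplete.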
Job submission
To run simulation jobs with the compiled file, we write another shell script in which each of the two inputs takes integer values from 1 to 4 and the jobs are submitted to the atlas5_large queue. The shell script is written as follows:
    #!/bin/bash
    # bash is needed for the C-style for loops below
    for ((x1=1; x1<=4; ++x1))
    do
        for ((x2=1; x2<=4; ++x2))
        do
            ## command line
            cmd="mcc-run2012a ./run_main.sh $x1 $x2"
            ## submission
            bsub -q atlas5_large -o jobAt5.o $cmd
        done
    done
    exit
This script submits 16 independent jobs to the cluster, which can run them simultaneously.
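A dry run that prints the submission commands instead of executing them is a cheap way to verify the parameter grid before anything reaches the queue. The sketch below is an illustration only: the queue name, the output file, and the mcc-run2012a wrapper are copied from the submission script above, while the function name print_jobs and its argument n are hypothetical.

```sh
#!/bin/sh
# Dry run: print the bsub command for every parameter pair
# instead of submitting it.
#   $1 = n, the largest value of each input (the article uses 4)
print_jobs() {
    n=$1
    x1=1
    while [ "$x1" -le "$n" ]; do
        x2=1
        while [ "$x2" -le "$n" ]; do
            echo "bsub -q atlas5_large -o jobAt5.o mcc-run2012a ./run_main.sh $x1 $x2"
            x2=$((x2 + 1))
        done
        x1=$((x1 + 1))
    done
}

# Print the commands for the 4 x 4 grid
print_jobs 4
```

The output has one line per job, so counting the lines confirms how many submissions the real script will make.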
Conclusion
The HPC cluster helps speed up Monte Carlo simulation. This short note explains how to compile Matlab files once and pass parameter values to the compiled file from outside, which makes it easy to run many independent jobs across a variety of parameter values.