Wednesday, January 23, 2008

Cellular Automata - Part II, Using Condor

My post on Cellular Automata from January 12 was not put there by accident: I want to use it as a starting point for a couple of exercises in my Parallel Processing course.
In that post I showed a few drawings that differ only by the generating rule number.

Today I am going to show how the Condor High-Throughput Computing system makes it very simple to handle a large volume of computations.

I used this simple Condor submit file:

universe = vanilla
executable = nks.py
# stdout, stderr and the Condor log of job number $(Process):
Error = err.$(Process)
Output = out.$(Process)
Log = log.$(Process)
# $(Process) runs from 0 to 255 and is passed to nks.py as its argument
Arguments = $(Process)
Queue 256

With it I was able to compute the whole set of 256 rules (one job per rule) with the same effort as computing a single rule.
I submitted the task to my Personal Condor on my laptop and was not disappointed: after a while I had all the outputs happily waiting for post-processing.
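Submitting is a single command, condor_submit followed by the name of the submit file. For completeness, here is a minimal sketch of the plumbing a worker script like nks.py needs; this is not my actual script, only an illustration of how the Condor side connects to the Python side (Condor expands $(Process) to the job number and captures the script's stdout in out.$(Process)):

import sys

# Condor passes $(Process), i.e. the rule number 0..255, as the argument.
rule = int(sys.argv[1])

# ... evolve the automaton for `rule` exactly as in the January 12 post
# below (see the sketch there) and print or save the result ...
print('rule %d done' % rule)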

Sunday, January 20, 2008

A special lecture at the BGU by Barton Miller



Distinguished Lecture Guest: Prof. Barton P. Miller
Computer Science Department
University of Wisconsin - Madison

Monday, January 28th, 2008 14:00-16:00 in the Saal Auditorium (202), Alon Hi-Tech Bldg (37)
at the Ben-Gurion University of the Negev, Beer-Sheva


A Framework for Binary Code Analysis and Static and Dynamic Patching

Barton P. Miller
Computer Sciences Department
University of Wisconsin
Madison, WI 53706
bart@cs.wisc.edu

Tools that analyze and modify binary code are crucial to many areas of computer science, including cyber forensics, program tracing, debugging, testing, performance profiling, performance modeling, and software engineering. While there are many tools used to support these activities, these tools have significant limitations in functionality, efficiency, accuracy, portability, and availability.

To overcome these limitations, we are actively working on the design and implementation of a new framework for binary code analysis and modification. The goal of this framework is to provide a component architecture to support tools that analyze binary code and modify it both statically (binary rewriting) and dynamically (dynamic instrumentation), allow for interoperability of the static and dynamic code modification, and enable the sharing and leveraging of this complex technology.

Characteristics of this framework include:

* multi-architecture, multi-format, and multi-operating system;
* library-based, so that components can be used separately as needed;
* open source, to allow both local control and auditing;
* extensible data structures, so that new analyses and interfaces can be added easily;
* exportable data structures, so that all analysis products will be stored in a format that can be readily used by other tools;
* batch enabled, so that tools can operate effectively without interactive control;
* testable, with each separate component provided with a detailed test suite;
* accurate and efficient, using best-known current algorithms and the addition of new algorithms for code parsing;
* up to date, handling modern binary code idioms like exceptions, and functions with non-contiguous and shared code.

This component-based approach requires identifying key portable and multi-platform abstractions at every level of functionality. If we (as a community) are successful in transitioning to this model, we can break the "hero model" of tool development, in which each group tries to produce its own end-to-end complete tool set.

The Paradyn group is working on several initial library components of this effort, including symbol table parsers, binary code scanners (instruction decoders), binary code parsers (control flow analysis), dynamic code generators, stack walkers, process execution controllers, and a visual binary code editor.

The goal of this talk is to lay out the motivation, plans, and current progress for this project. We also hope to solicit feedback on both its design and functionality.

====================
This talk is in the framework of the GriVAC collaboration meeting at the BGU

Saturday, January 12, 2008

The mysteries of Cellular Automata

I recently bought Stephen Wolfram's book "A New Kind of Science". The book is interesting and I highly recommend it.
In the spirit of one of my favorite phrases by Confucius,
"I hear and I forget. I see and I remember. I do and I understand," I decided to reproduce some of the first examples given in the book. I wrote about 100 lines of Python code and enjoyed the beauty of the results.
The whole book is available online; I am referring here to the plots on page 55. A few of my plots are enclosed below.

[Plots: Rule 25, Rule 22, Rule 30, Rule 60 and Rule 73]
The richness of the patterns produced by such very simple interaction rules is still a mystery to me.
I would call it Social Networking by pixels.
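For readers who want to try this themselves, below is a minimal sketch of the kind of code involved; my actual script is about 100 lines and differs in the details (the grid size here is an arbitrary choice, and the periodic boundary is a simplification):

import matplotlib.pyplot as plt

def evolve(rule, width=201, steps=100):
    # Bit k of the rule number gives the new cell state for a 3-cell
    # neighborhood whose binary value is k (Wolfram's convention).
    bits = [(rule >> k) & 1 for k in range(8)]
    row = [0] * width
    row[width // 2] = 1  # start from a single black cell
    grid = [row]
    for _ in range(steps):
        row = [bits[(row[(i - 1) % width] << 2) |
                    (row[i] << 1) |
                    row[(i + 1) % width]]  # periodic boundaries
               for i in range(width)]
        grid.append(row)
    return grid

plt.imshow(evolve(30), cmap='gray_r', interpolation='nearest')
plt.title('Rule 30')
plt.show()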

Tuesday, January 08, 2008

The next IGT HPC work group meeting



Monday, January 14th, 2008

IGT Offices, Maskit 4, 5th Floor, Hertzliya

14:00-14:15: OPENING - Avner & Guy

14:15-15:00: “GridGain – Java Grid Computing Made Simple”

Speaker: Nikita Ivanov, Founder GridGain Systems
Duration: 45 minutes
Language: English

Abstract:
This presentation is intended to provide a high-level overview of
GridGain, an open source Java grid computing framework. The presentation
provides both a canonical overview of the software framework and a live
coding demonstration underscoring the powerful simplicity of GridGain.
The presentation is split into two approximately equal parts:
* In the first part, a formal introduction to GridGain and
computational grid computing is given. The different types of
grid computing will be briefly discussed, as well as the key
features that GridGain provides.
* In the second part, a live coding example will be shown,
demonstrating how to build and run a grid application from
scratch in front of the audience. This demonstration will
highlight one of the key advantages of using GridGain: simple
and transparent grid-enabling of existing applications using
Java annotations.

15:00-15:10 Break

15:10-15:45 Grid Computing – The Simpsons, Shrek 3, and Keeping the Lights On

Speaker: Mike Duffy, Founder CEO of Axceleon

Duration: 35 minutes
Language: English

Abstract:

I want to speak about how grid computing is used in two very interesting and important industries: the Hollywood movie business and the power/energy sector. Without the ability to build and utilize large compute grids, the movie industry would not be able to produce special-effects-heavy movies like Transformers, Shrek and The Simpsons Movie. Electrical generation companies around the world are looking to grid computing not only to save money but also to increase the reliability of their electrical grids.

15:45-16:00: DISCUSSION AND CONCLUDING REMARKS

To register, please send your contact details to info@grid.org.il

We are looking forward to seeing you!

Best Regards,


Guy Tel-Zur, Ph.D.

Grid-HPC WG Director

IGT

www.Grid.org.il




Saturday, January 05, 2008

First trials with Hadoop


I followed the Hadoop Quickstart guide and the whole process is described below.

This post can be used as a reference for other people installing Hadoop.

My system is openSUSE 10.3 and my Java version is 1.6.0_03.

After downloading and installing the package, I ran the Standalone Operation test:

$ mkdir input
$ cp conf/*.xml input
$ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
$ cat output/*

Here is the output:

gtz2:/home/telzur/downloads/hadoop-0.14.4 # bin/hadoop jar hadoop-0.14.4-examples.jar grep input output 'dfs[a-z.]+'
08/01/05 15:47:13 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
08/01/05 15:47:13 INFO mapred.FileInputFormat: Total input paths to process : 3
08/01/05 15:47:13 INFO mapred.JobClient: Running job: job_local_1
08/01/05 15:47:13 INFO mapred.MapTask: numReduceTasks: 1
08/01/05 15:47:13 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/input/mapred-default.xml:0+180
08/01/05 15:47:13 INFO mapred.MapTask: numReduceTasks: 1
08/01/05 15:47:13 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/input/hadoop-default.xml:0+27489
08/01/05 15:47:13 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/input/hadoop-default.xml:0+27489
08/01/05 15:47:13 INFO mapred.MapTask: numReduceTasks: 1
08/01/05 15:47:13 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/input/hadoop-site.xml:0+178
08/01/05 15:47:13 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/input/hadoop-site.xml:0+178
08/01/05 15:47:14 INFO mapred.LocalJobRunner: reduce > reduce
08/01/05 15:47:14 INFO mapred.TaskRunner: Saved output of task 'reduce_3r6jh8' to file:/home/telzur/downloads/hadoop-0.14.4/grep-temp-346467784
08/01/05 15:47:14 INFO mapred.JobClient: Job complete: job_local_1
08/01/05 15:47:14 INFO mapred.JobClient: Counters: 9
08/01/05 15:47:14 INFO mapred.JobClient: Map-Reduce Framework
08/01/05 15:47:14 INFO mapred.JobClient: Map input records=940
08/01/05 15:47:14 INFO mapred.JobClient: Map output records=34
08/01/05 15:47:14 INFO mapred.JobClient: Map input bytes=27847
08/01/05 15:47:14 INFO mapred.JobClient: Map output bytes=942
08/01/05 15:47:14 INFO mapred.JobClient: Combine input records=34
08/01/05 15:47:14 INFO mapred.JobClient: Combine output records=33
08/01/05 15:47:14 INFO mapred.JobClient: Reduce input groups=33
08/01/05 15:47:14 INFO mapred.JobClient: Reduce input records=33
08/01/05 15:47:14 INFO mapred.JobClient: Reduce output records=33
08/01/05 15:47:14 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
08/01/05 15:47:14 INFO mapred.FileInputFormat: Total input paths to process : 1
08/01/05 15:47:14 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/input/mapred-default.xml:0+180
08/01/05 15:47:14 INFO mapred.JobClient: Running job: job_local_1
08/01/05 15:47:14 INFO mapred.MapTask: numReduceTasks: 1
08/01/05 15:47:14 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/grep-temp-346467784/part-00000:0+1279
08/01/05 15:47:14 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/grep-temp-346467784/part-00000:0+1279
08/01/05 15:47:14 INFO mapred.LocalJobRunner: reduce > reduce
08/01/05 15:47:14 INFO mapred.TaskRunner: Saved output of task 'reduce_h016y4' to file:/home/telzur/downloads/hadoop-0.14.4/output
08/01/05 15:47:14 INFO mapred.LocalJobRunner: reduce > reduce
08/01/05 15:47:14 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/input/hadoop-default.xml:0+27489
08/01/05 15:47:14 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/input/hadoop-site.xml:0+178
08/01/05 15:47:15 INFO mapred.LocalJobRunner: reduce > reduce
08/01/05 15:47:15 INFO mapred.JobClient: Job complete: job_local_1
08/01/05 15:47:15 INFO mapred.JobClient: Counters: 7
08/01/05 15:47:15 INFO mapred.JobClient: Map-Reduce Framework
08/01/05 15:47:15 INFO mapred.JobClient: Map input records=33
08/01/05 15:47:15 INFO mapred.JobClient: Map output records=33
08/01/05 15:47:15 INFO mapred.JobClient: Map input bytes=1193
08/01/05 15:47:15 INFO mapred.JobClient: Map output bytes=929
08/01/05 15:47:15 INFO mapred.JobClient: Reduce input groups=4
08/01/05 15:47:15 INFO mapred.JobClient: Reduce input records=66
08/01/05 15:47:15 INFO mapred.JobClient: Reduce output records=66
08/01/05 15:47:15 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/grep-temp-346467784/part-00000:0+1279
gtz2:/home/telzur/downloads/hadoop-0.14.4 # cat output/*
2 dfs.
1 dfs.block.size
1 dfs.blockreport.interval
1 dfs.client.block.write.retries
1 dfs.client.buffer.dir
1 dfs.data.dir
1 dfs.datanode.bind
1 dfs.datanode.dns.interface
1 dfs.datanode.dns.nameserver
1 dfs.datanode.du.pct
1 dfs.datanode.du.reserved
1 dfs.datanode.port
1 dfs.default.chunk.view.size
1 dfs.df.interval
1 dfs.heartbeat.interval
1 dfs.hosts
1 dfs.hosts.exclude
1 dfs.impl
1 dfs.info.bind
1 dfs.info.port
1 dfs.name.dir
1 dfs.namenode.handler.count
1 dfs.namenode.logging.level
1 dfs.network.script
1 dfs.replication
1 dfs.replication.consider
1 dfs.replication.max
1 dfs.replication.min
1 dfs.replication.min.
1 dfs.safemode.extension
1 dfs.safemode.threshold.pct
1 dfs.secondary.info.bind
1 dfs.secondary.info.port
gtz2:/home/telzur/downloads/hadoop-0.14.4 #
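As a quick sanity check, the counts above can be reproduced locally with a few lines of Python. This is only a sketch of what the grep example computes (map: extract regex matches; combine/reduce: sum the counts), run from the same hadoop-0.14.4 directory:

import re
import glob

# Count every match of the same regex over the same input files,
# mimicking the output of Hadoop's grep example.
pattern = re.compile(r'dfs[a-z.]+')
counts = {}
for path in glob.glob('input/*.xml'):
    for match in pattern.findall(open(path).read()):
        counts[match] = counts.get(match, 0) + 1

for match in sorted(counts, key=counts.get, reverse=True):
    print('%d %s' % (counts[match], match))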


Next step: Pseudo-Distributed Operation
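One step I am not showing in detail: before formatting HDFS, the quickstart has you edit conf/hadoop-site.xml so that everything talks to a local NameNode and JobTracker. For Hadoop 0.14 the suggested configuration looked roughly like this (localhost:9000 and localhost:9001 are the quickstart's example ports):

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>localhost:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>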

Format a new distributed-filesystem:
gtz2:/home/telzur/downloads/hadoop-0.14.4 # bin/hadoop namenode -format
08/01/05 17:06:57 INFO dfs.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = gtz2/127.0.0.1
STARTUP_MSG: args = [-format]
************************************************************/
08/01/05 17:06:58 INFO dfs.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
08/01/05 17:06:58 INFO dfs.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at gtz2/127.0.0.1
************************************************************/
gtz2:/home/telzur/downloads/hadoop-0.14.4 #

gtz2:/home/telzur/downloads/hadoop-0.14.4 # bin/start-all.sh
starting namenode, logging to /home/telzur/downloads/hadoop-0.14.4/bin/../logs/hadoop-telzur-namenode-gtz2.out
localhost: starting datanode, logging to /home/telzur/downloads/hadoop-0.14.4/bin/../logs/hadoop-root-datanode-gtz2.out
localhost: starting secondarynamenode, logging to /home/telzur/downloads/hadoop-0.14.4/bin/../logs/hadoop-root-secondarynamenode-gtz2.out
starting jobtracker, logging to /home/telzur/downloads/hadoop-0.14.4/bin/../logs/hadoop-telzur-jobtracker-gtz2.out
localhost: starting tasktracker, logging to /home/telzur/downloads/hadoop-0.14.4/bin/../logs/hadoop-root-tasktracker-gtz2.out
gtz2:/home/telzur/downloads/hadoop-0.14.4 #


Browsing the web interfaces (by default the NameNode is at http://localhost:50070/ and the JobTracker at http://localhost:50030/):






Run the examples:

gtz2:/home/telzur/downloads/hadoop-0.14.4 # bin/hadoop dfs -put conf input
gtz2:/home/telzur/downloads/hadoop-0.14.4 # bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
08/01/05 17:21:39 INFO mapred.FileInputFormat: Total input paths to process : 10
08/01/05 17:21:40 INFO mapred.JobClient: Running job: job_200801051712_0001
08/01/05 17:21:41 INFO mapred.JobClient: map 0% reduce 0%
08/01/05 17:21:51 INFO mapred.JobClient: map 18% reduce 0%
08/01/05 17:21:52 INFO mapred.JobClient: map 36% reduce 0%
08/01/05 17:21:53 INFO mapred.JobClient: map 54% reduce 0%
08/01/05 17:21:54 INFO mapred.JobClient: map 63% reduce 0%
08/01/05 17:21:55 INFO mapred.JobClient: map 72% reduce 0%
08/01/05 17:21:56 INFO mapred.JobClient: map 90% reduce 0%
08/01/05 17:21:57 INFO mapred.JobClient: map 100% reduce 0%
08/01/05 17:22:06 INFO mapred.JobClient: map 100% reduce 27%
08/01/05 17:22:07 INFO mapred.JobClient: map 100% reduce 100%
08/01/05 17:22:08 INFO mapred.JobClient: Job complete: job_200801051712_0001
08/01/05 17:22:08 INFO mapred.JobClient: Counters: 12
08/01/05 17:22:08 INFO mapred.JobClient: Job Counters
08/01/05 17:22:08 INFO mapred.JobClient: Launched map tasks=11
08/01/05 17:22:08 INFO mapred.JobClient: Launched reduce tasks=1
08/01/05 17:22:08 INFO mapred.JobClient: Data-local map tasks=11
08/01/05 17:22:08 INFO mapred.JobClient: Map-Reduce Framework
08/01/05 17:22:08 INFO mapred.JobClient: Map input records=1153
08/01/05 17:22:08 INFO mapred.JobClient: Map output records=43
08/01/05 17:22:08 INFO mapred.JobClient: Map input bytes=34316
08/01/05 17:22:08 INFO mapred.JobClient: Map output bytes=1118
08/01/05 17:22:08 INFO mapred.JobClient: Combine input records=43
08/01/05 17:22:08 INFO mapred.JobClient: Combine output records=39
08/01/05 17:22:08 INFO mapred.JobClient: Reduce input groups=38
08/01/05 17:22:08 INFO mapred.JobClient: Reduce input records=39
08/01/05 17:22:08 INFO mapred.JobClient: Reduce output records=38
08/01/05 17:22:08 INFO mapred.FileInputFormat: Total input paths to process : 1
08/01/05 17:22:09 INFO mapred.JobClient: Running job: job_200801051712_0002
08/01/05 17:22:10 INFO mapred.JobClient: map 0% reduce 0%
08/01/05 17:22:18 INFO mapred.JobClient: map 100% reduce 0%
08/01/05 17:22:24 INFO mapred.JobClient: map 100% reduce 100%
08/01/05 17:22:25 INFO mapred.JobClient: Job complete: job_200801051712_0002
08/01/05 17:22:25 INFO mapred.JobClient: Counters: 10
08/01/05 17:22:25 INFO mapred.JobClient: Job Counters
08/01/05 17:22:25 INFO mapred.JobClient: Launched map tasks=1
08/01/05 17:22:25 INFO mapred.JobClient: Launched reduce tasks=1
08/01/05 17:22:25 INFO mapred.JobClient: Data-local map tasks=1
08/01/05 17:22:25 INFO mapred.JobClient: Map-Reduce Framework
08/01/05 17:22:25 INFO mapred.JobClient: Map input records=38
08/01/05 17:22:25 INFO mapred.JobClient: Map output records=38
08/01/05 17:22:25 INFO mapred.JobClient: Map input bytes=1330
08/01/05 17:22:25 INFO mapred.JobClient: Map output bytes=1026
08/01/05 17:22:25 INFO mapred.JobClient: Reduce input groups=3
08/01/05 17:22:25 INFO mapred.JobClient: Reduce input records=38
08/01/05 17:22:25 INFO mapred.JobClient: Reduce output records=38

Examine the output files:
gtz2:/home/telzur/downloads/hadoop-0.14.4 # bin/hadoop dfs -get output output
gtz2:/home/telzur/downloads/hadoop-0.14.4 # cat output/*
cat: output/output: Is a directory
2 dfs.
1 dfs.block.size
1 dfs.blockreport.interval
1 dfs.client.block.write.retries
1 dfs.client.buffer.dir
1 dfs.data.dir
1 dfs.datanode.bind
1 dfs.datanode.dns.interface
1 dfs.datanode.dns.nameserver
1 dfs.datanode.du.pct
1 dfs.datanode.du.reserved
1 dfs.datanode.port
1 dfs.default.chunk.view.size
1 dfs.df.interval
1 dfs.heartbeat.interval
1 dfs.hosts
1 dfs.hosts.exclude
1 dfs.impl
1 dfs.info.bind
1 dfs.info.port
1 dfs.name.dir
1 dfs.namenode.handler.count
1 dfs.namenode.logging.level
1 dfs.network.script
1 dfs.replication
1 dfs.replication.consider
1 dfs.replication.max
1 dfs.replication.min
1 dfs.replication.min.
1 dfs.safemode.extension
1 dfs.safemode.threshold.pct
1 dfs.secondary.info.bind
1 dfs.secondary.info.port
gtz2:/home/telzur/downloads/hadoop-0.14.4 #

Re-check the web interfaces after the job ended:







Finally, stop the daemons when we are done:
gtz2:/home/telzur/downloads/hadoop-0.14.4 # bin/stop-all.sh
stopping jobtracker
localhost: stopping tasktracker
stopping namenode
localhost: stopping datanode
localhost: stopping secondarynamenode

And that concludes the Hadoop Quickstart tutorial.



Wednesday, January 02, 2008

How to encrypt/decrypt a file using openssl

To encrypt:
# openssl bf -a -salt -in original_file.odt -out encrypted_file.bf
You will be prompted to type and then re-type a password.

Here, bf stands for the Blowfish algorithm.

To decrypt:
# openssl bf -d -salt -a -in ./encrypted_file.bf -out ./original_file.odt
Use the same password when asked.

-d stands for decryption.
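One way to convince yourself that the round trip is lossless is to decrypt into a different file name (say decrypted_file.odt, a name made up for this example) and compare digests; a few lines of Python will do:

import hashlib

def sha1_of(path):
    # Hash the file in chunks so large files need not fit in memory.
    h = hashlib.sha1()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(8192), b''):
            h.update(chunk)
    return h.hexdigest()

print(sha1_of('original_file.odt') == sha1_of('decrypted_file.odt'))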

For more information and examples type: man enc