Wednesday, December 03, 2008

Photos from the IGT2008

The IGT 2008 annual event just ended. Here is a link to a few photos I took using my mobile phone.

Thursday, November 20, 2008

IGT HPC Work Group Meeting

- Intel® Parallel Software Tools for HPC and Multi-Core Computing
- Erlang for 5 nines

Monday, December 8th, 2008

IGT Offices, Maskit 4, 5th Floor, Hertzliya.

14:00-14:15: OPENING - Avner & Guy

14:15-15:00: Intel® Parallel Software Tools for HPC and Multi-Core Computing

Herbert Cornelius, Director Advanced Computing Center EMEA and
Alexander Supalov, Engineering Manager Intel Cluster Tools

Abstract

As Moore's Law continues, more and more parallelism is being introduced into all computing platforms, at all levels of integration and programming. Especially in High-Performance Computing (HPC), users can employ a combination of different parallel hardware and software architectures and programming environments. These technologies range from vectorization and SIMD computation, through shared-memory multi-threading (e.g. OpenMP), to distributed-memory message passing (e.g. MPI) on cluster systems. This also puts ever more demand on software development tools for parallel computing, with their various challenges and opportunities. This talk will briefly outline the major hardware architectural trends in HPC (including multi-core) and the associated parallel software tools. We will cover the respective Intel® Software Tools for the main phases of parallel software development: implementation, correctness, analysis and performance. Special emphasis will be put on the MPI message-passing library as a vital component for scalable high-performance applications on modern cluster systems.
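As a small teaser for the MPI part of the talk, here is how message passing looks from Python with the mpi4py bindings; this is my own minimal sketch (it assumes mpi4py and an MPI runtime are installed), not part of the speakers' material:

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# each rank sums a strided slice of 0..999, then rank 0 reduces
local = sum(range(rank, 1000, size))
total = comm.reduce(local, op=MPI.SUM, root=0)
if rank == 0:
    print("total = %d" % total)   # 499500, whatever the number of ranks

Run it, for example, with: mpirun -np 4 python reduce_demo.py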

15:00-15:15 Break

15:15-16:00: Erlang for 5 nines

Zvi Avraham, CTO, Nivertech Ltd.


Abstract

Zvi will present the concurrent, soft real-time functional
programming language Erlang and its OTP (Open Telecom Platform)
framework. He will explain why Erlang programs are generally 4-10
times shorter than their counterparts in Java, C and C++ while
achieving 99.999% availability. The talk also explains why Erlang,
originally invented to handle the next generation of telecom products,
has been successful in a much wider range of sectors, including
Internet services, banking and e-commerce.
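Erlang itself is not shown here, but the flavor of its model, isolated processes that share nothing and communicate only by messages, can be roughly mimicked in Python with a thread and a queue. A toy sketch of my own, far from OTP's real supervision and fault tolerance:

import threading, queue

def actor(inbox):
    # an Erlang-style "process": no shared state, reacts only to messages
    while True:
        msg, reply_to = inbox.get()
        if msg is None:                  # shutdown message
            return
        reply_to.put("echo: %s" % msg)

if __name__ == "__main__":
    inbox, outbox = queue.Queue(), queue.Queue()
    threading.Thread(target=actor, args=(inbox,)).start()
    inbox.put(("hello", outbox))
    print(outbox.get())                  # prints: echo: hello
    inbox.put((None, None))              # let the actor exit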

16:00-16:15: DISCUSSION AND CONCLUDING REMARKS

To register, please send your contact details to info@grid.org.il

We are looking forward to seeing you!

Best Regards,

Guy Tel-Zur, Ph.D.

IGT Chairman

www.Grid.org.il

Sunday, November 16, 2008

VPN over Cellular

I like my Nokia 6210 Navigator; I use it to connect securely via VPN to the Ben-Gurion University domain when I am out of the office. Below are two pictures that show how it works.

I connect the phone via a USB cable (Bluetooth is also possible), then I use the Nokia PC Suite to connect to the Internet while the mobile phone acts as a modem.

I then use the VPN (RSA SecurID) to connect to the campus.


Two comments:

1) To make the test fully reliable, WiFi was completely switched off using the airplane-mode button located at the front of my T61 ThinkPad notebook.

2) I wrote this post using the new Microsoft Live Writer Beta, a multi-blog editor which I find useful.

Tuesday, November 11, 2008

The 2008 Comsol Conference in Tel Aviv

Comsol is an interesting tool for Multi-Physics simulations.
The annual Comsol conference in Israel took place today; the agenda is described below:

Here are two photos I took using my cellphone:


The conference building:


I have serious doubts about how one can do serious Multi-Physics simulations without serious Parallel Processing capabilities. The current version of Comsol, 3.5, has no distributed parallel processing support (e.g. MPI) and only limited multi-core support, none of it exposed at the user level: there are no parallel computing commands and no support for external parallel tools like Star-P. This limits the problem size that can be simulated.
However, it is for sure an excellent tool for education.

Monday, October 06, 2008

IGT HPC Work Group Meeting - Intel TBB & Microsoft HPC 2008 Server

Monday, October 27th, 2008.

IGT Offices, Maskit 4, 5th Floor, Hertzliya.

14:00-14:15: OPENING - Avner & Guy

14:15-15:00: High-Performance and Productivity Computing with Windows HPC

Mr. Doron Caspin, Architect Evangelist, Microsoft Israel.

Abstract

It's a fact: Windows HPC Server 2008 (HPCS) combines the power of the Windows Server platform with rich, out-of-the-box functionality to help improve productivity and reduce the complexity of your HPC environment. Windows HPC Server 2008 can efficiently scale to thousands of processing cores and provides a comprehensive set of deployment, administration, and monitoring tools that are easy to deploy, manage, and integrate with your existing infrastructure. This lecture will provide an overview of the Microsoft HPC solution.

15:00-15:15 Break

15:15-16:00: Introduction to Intel Threading Building Blocks (TBB)

Dr. Ami Marowka, Department of Software Engineering, Shenkar College of Engineering and Design, Israel.

Abstract

Intel Threading Building Blocks (TBB) is a C++ template library for developing parallel applications that run on top of multi-core processors. The library consists of building blocks (data structures and algorithms) that free the programmer from some of the complications arising from the use of native threading mechanisms, such as thread creation, synchronization, and termination. TBB abstracts access to the multiple processors by automating the decomposition of data into small, cache-friendly chunks called "tasks", and then schedules them onto individual cores dynamically and efficiently. This approach enables the programmer to focus on the program logic rather than on ways to optimize the program for the underlying machine architecture.
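TBB itself is C++, but the chunk-and-schedule idea is easy to illustrate with a rough Python analogy using multiprocessing (my own sketch, not TBB and not the speaker's material):

from multiprocessing import Pool

def work(x):
    # stand-in for a per-element computation
    return x * x

if __name__ == "__main__":
    pool = Pool()   # defaults to one worker process per core
    # chunksize plays the role of TBB's cache-friendly "tasks": the
    # input is cut into chunks handed to whichever worker is free
    results = pool.map(work, range(100000), chunksize=1000)
    pool.close()
    pool.join()
    print("first results: %s" % results[:5])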

16:00-16:15: DISCUSSION AND CONCLUDING REMARKS

To register, please send your contact details to info@grid.org.il

We are looking forward to seeing you!

Best Regards,

Guy Tel-Zur, Ph.D.

IGT Chairman

www.Grid.org.il

Wednesday, September 24, 2008

IGT 2008 - Cloud Computing Conference

Monday, September 01, 2008

IGT2008 - The World Summit of Cloud Computing

IGT2008 - The World Summit of Cloud Computing conference registration is now open.

IGT is pleased to announce the opening of registration for the IGT2008 conference to be held in Israel on 1-2 December 2008.

http://www.cloudcomputing.org.il/

IGT2008 will focus on Cloud Computing and its impact on enterprise IT, the next-generation data center, SaaS and Utility Computing.

This year, keynote speakers will include top technology leaders from companies that are creating and influencing the Cloud Computing technologies and business.

Early Bird prices are available until 16 October, so it is advised to register in advance!

See you at IGT2008!

Cloud Computing - The New IT Economy

Avner Algom
General Manager
The Israeli Association of Grid Technologies (IGT)
www.Grid.org.il

Thursday, August 21, 2008

Cloud Computing - A case study

Next time you need computing resources, e.g. a web server, get them from the cloud; it is simple and easy!
This post is not meant to be a full tutorial, only a brief description of my own personal experience playing with Amazon EC2.
For more complete information about working with Amazon EC2, check this or this reference, or browse here for additional documentation.

Step 1: Create an Amazon Elastic Compute Cloud (EC2) account. Pricing and additional information are available here.
Step 2: Download the Amazon command-line API tools and/or install Elasticfox, a Firefox plug-in.
Step 3: Set the security keys to allow SSH.
Step 4: Configure Elasticfox (in my case, open port 22 for SSH and port 80 for HTTP).
Step 5: I used one of the pre-configured Amazon Machine Images (AMIs), so starting the machine was immediate.
Step 6: Enjoy the new virtual machine; see the screen shots below:


Figure 1: The Amazon Elasticfox GUI.


Figure 2: Connecting to the machine via SSH.


Figure 3: Starting a web server on the virtual machine.
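For those who prefer scripts to GUIs, the same launch can also be done programmatically. Here is a minimal sketch using the boto Python library, where the AMI ID and key-pair name are placeholders and your AWS credentials are assumed to be set in the environment:

import boto

# connect_ec2() picks up AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
conn = boto.connect_ec2()

# launch one small instance from a placeholder AMI
reservation = conn.run_instances(
    "ami-00000000",              # placeholder, not a real image ID
    key_name="my-keypair",       # hypothetical key-pair name
    instance_type="m1.small")

instance = reservation.instances[0]
print("started instance %s" % instance.id)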

Cloud Computing at its best!

The basic price is $0.10/hour ($2.40/day or $876/year). Not bad if you take the Total Cost of Ownership (TCO) into account: no worries about up-time (UPS) [that remains to be seen; there were a few unpleasant surprises], backup, cooling, space, electric bills....

Outlook:
An easy user interface, together with reasonable security and cost-effectiveness, will make Cloud Computing a serious alternative to organizations' local data centers.

Saturday, July 12, 2008

Farewell Madison Wisconsin

I left Madison today after a week-long visit.
A great city, a great university and great people.
I hope to visit there again soon.

Monday, June 16, 2008

Grids are Dead! Or are they?

Grids are Dead! Or are they?
By Wolfgang Gentzsch, DEISA; Duke University.
Published in GridToday, June 16, 2008.
See also one of my previous posts: "The End of Grid Computing?" from October 2007.

Friday, June 13, 2008

How to securely connect to the BGU from Linux

The following instructions apply when connecting from outside the campus:

Step 1: VPN access:
Install the CheckPoint SNX (SSL Network eXtender) VPN tool for Linux.
(The version I installed works with OpenSuse 10.3 but crashes on Fedora 9.)
telzur@gtz2:~> snx -s vpn.bgu.ac.il -u username@vpn
Then provide your PIN and the SecurID code.

Step 2: Connect to the Windows terminal server using remote desktop:
telzur@gtz2:~> rdesktop -u user-name -d group-of-users remote-host.bgu.ac.il

and you will get this login window:

Friday, May 30, 2008

Erlang - software for a concurrent world

I would like to recommend watching the following on-line presentation:
InfoQ: Erlang - software for a concurrent world, by Joe Armstrong, Erlang's principal inventor.
These days I am also reading his book.
Erlang is interesting and intriguing.

Tuesday, May 27, 2008

Grid Computing in Israel

Last week I attended the HP-Cast 10 conference, which took place in Singapore.
At the conference, as a delegate of the Israeli Association of Grid Technologies, I gave an overview presentation about Grid Computing in Israel. The presentation PDF file is available here.

Tuesday, April 29, 2008

Multi-Core 2008



Multi-Core 2008
May 14, 2008
http://www.cs.bgu.ac.il/~frankel/MultiCore08/index.html
The event program

Friday, April 11, 2008

My profile photo

My profile photo is in fact a business card with my details encoded inside. It is a barcode readable by Nokia phones with a built-in camera, e.g. the N95.
For more information about this cool thing, check this link: http://mobilecodes.nokia.com/scan.htm
Point your camera phone at the mobile code and, if you have a barcode reader preinstalled, you will be able to store my details.
Unfortunately, my mobile phone has no camera, so I cannot check it myself. If you have a mobile phone with a camera and a barcode reader, please send me feedback on whether it really works.


Friday, March 07, 2008

IGT Grid-HPC WG Meeting: UNICORE - A European Grid Technology

THE NEXT GRID-HPC WORK GROUP MEETING

Monday, March 31st, 2008

IGT Offices, Maskit 4, 5th Floor, Hertzliya

Agenda:

14:00-14:15: OPENING - Avner & Guy

14:15-15:15: UNICORE - A European Grid Technology

Speaker: Dr. Achim Streit

Head of Division "Distributed Systems and Grid Computing"

Jülich Supercomputing Center

Institute for Advanced Simulation

Forschungszentrum Jülich GmbH

Abstract:

The development of UNICORE started back in 1997 with two projects funded by the German ministry of education and research (BMBF). UNICORE is a vertically integrated Grid middleware that provides seamless, secure, and intuitive access to distributed resources and data, with components on all levels of a Grid architecture, from an easy-to-use graphical client down to the interfaces to the Grid resources. Furthermore, UNICORE has strong support for workflows, while security is established through X.509 certificates. Since 2002, UNICORE has been continuously improved to mature, production-ready quality and enhanced with more functionality in several European projects. Today UNICORE is used in several national and international Grid infrastructures such as D-Grid and DEISA, and it also provides access to the national supercomputer of the NIC in Germany. The talk will give details about the new version, UNICORE 6, which is web-services enabled, OGSA-based and standards-compliant. To begin with, the underlying design principles and concepts of UNICORE are presented. A detailed architecture diagram shows the different components of UNICORE 6 and their interdependencies. This is followed by a view on the adoption of common open standards in UNICORE 6, which allows interoperability with other Grid technologies and the realisation of an open and extensible architecture. The talk closes with some interesting examples of where the UNICORE Grid technology is used. The European UNICORE Grid Middleware is available as Open Source from http://www.unicore.eu

15:15-15:20: BREAK

15:20-15:30: DISCUSSION AND CONCLUDING REMARKS

To register, please send your contact details to: info@grid.org.il

We are looking forward to seeing you!

Best Regards,

Guy Tel-Zur

Grid-HPC WG Director

IGT

www.Grid.org.il


Date: Mar 31, 2008, 14:00-16:30
Location: IGT Offices, Maskit 4, 5th Floor, Hertzliya
The lecture will be transmitted live via WebEx.

Tuesday, February 05, 2008

Cloud Computing is here to stay

First trials with Xcerion XIOS/3 Beta.

A remote OS and applications consumed completely from the "Cloud". All you need is a Web Browser.

Xcerion looks cool, although at the moment it works only on top of IE6+ and not yet on Firefox.
The response time is reasonable and the window manager is nice and intuitive:



A thin client/OLPC plus Xcerion could make an interesting, cheap computing platform.

Wednesday, January 23, 2008

Cellular Automata - Part II, Using Condor

My post on Cellular Automata from January 12 was not put there by mistake: I want to use it as a starting point for a couple of exercises in my Parallel Processing course.
In that post I gave a few drawings that differ only in the generating rule number.

Today I am going to show how the Condor High-Throughput Computing system makes it very simple to handle a large volume of computations.

I used this simple Condor submit file:

# one cluster of 256 jobs; $(Process) runs from 0 to 255,
# so each job computes a single CA rule
universe = vanilla
executable = nks.py
Error = err.$(Process)
Output = out.$(Process)
Log = log.$(Process)
Arguments = $(Process)
Queue 256

With it I was able to compute the whole set of 256 rules (jobs) with the same effort as computing a single rule.
I submitted the task to the Personal Condor on my laptop and was not disappointed; after a while all the outputs were happily waiting for post-processing.
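For reference, here is a hypothetical skeleton of nks.py, just to show how the $(Process) value arrives as the rule number (the core CA computation itself is sketched in the January 12 post below):

import sys

if __name__ == "__main__":
    # Condor expands $(Process) to 0..255, one value per queued job
    rule = int(sys.argv[1])
    print("computing cellular automaton for rule %d" % rule)
    # ...the evolution for this rule would be computed and saved here...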

Sunday, January 20, 2008

A special lecture at the BGU by Barton Miller



Distinguished Lecture Guest: Prof. Barton P. Miller
Computer Science Department
University of Wisconsin - Madison

Monday, January 28th, 2008 14:00-16:00 in the Saal Auditorium (202), Alon Hi-Tech Bldg (37)
at the Ben-Gurion University of the Negev, Beer-Sheva


A Framework for Binary Code Analysis and Static and Dynamic Patching

Barton P. Miller
Computer Sciences Department
University of Wisconsin
Madison, WI 53706
bart@cs.wisc.edu

Tools that analyze and modify binary code are crucial to many areas of computer science, including cyber forensics, program tracing, debugging, testing, performance profiling, performance modeling, and software engineering. While there are many tools used to support these activities, these tools have significant limitations in functionality, efficiency, accuracy, portability, and availability.

To overcome these limitations, we are actively working on the design and implementation of a new framework for binary code analysis and modification. The goal of this framework is to provide a component architecture to support tools that analyze binary code and modify it both statically (binary rewriting) and dynamically (dynamic instrumentation), allow for interoperability of the static and dynamic code modification, and enable the sharing and leveraging of this complex technology.

Characteristics of this framework include:

* multi-architecture, multi-format, and multi-operating system;
* library-based, so that components can be used separately as needed;
* open source, to allow both local control and auditing;
* extensible data structures, so that new analyses and interfaces can be added easily;
* exportable data structures, so that all analysis products will be stored in a format that can be readily used by other tools;
* batch enabled, so that tools can operate effectively without interactive control;
* testable, with each separate component provided with a detailed test suite;
* accurate and efficient, using best-known current algorithms and the addition of new algorithms for code parsing;
* up to date, handling modern binary code idioms like exceptions, and functions with non-contiguous and shared code.

This component-based approach requires identifying key portable and multi-platform abstractions at every level of functionality. If we (as a community) are successful in transitioning to this model, we can break the "hero model" of tool development, in which each group tries to produce its own complete end-to-end tool set.

The Paradyn group is working on several initial library components of this effort, including symbol table parsers, binary code scanners (instruction decoders), binary code parsers (control flow analysis), dynamic code generators, stack walkers, process execution controllers, and a visual binary code editor.

The goal of this talk is to lay out the motivation, plans, and current progress for this project. We also hope to solicit feedback on both its design and functionality.

====================
This talk is in the framework of the GriVAC collaboration meeting at the BGU

Saturday, January 12, 2008

The mysteries of Cellular Automata

I recently bought Stephen Wolfram's book "A New Kind of Science". The book is interesting and I highly recommend it.
In the spirit of one of my favorite phrases by Confucius,
"I hear and I forget. I see and I remember. I do and I understand", I decided to reproduce some of the first examples given in the book. I wrote about 100 lines of Python code and enjoyed the beauty of the results.
The whole book is available online; I am referring here to the plots on page 55. A short sketch of the idea appears below, followed by a few of my plots.
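My original script is not reproduced here, but a stripped-down sketch of the core idea, where each new cell is looked up from the bits of the rule number, goes roughly like this:

import sys

def step(cells, rule):
    # one generation: the (left, center, right) triple of each cell
    # indexes one of the 8 bits of the rule number
    n = len(cells)
    return [(rule >> (cells[(i - 1) % n] * 4 +
                      cells[i] * 2 +
                      cells[(i + 1) % n])) & 1
            for i in range(n)]

def evolve(rule, steps, width):
    cells = [0] * width
    cells[width // 2] = 1        # a single black cell in the first row
    rows = [cells]
    for _ in range(steps):
        rows.append(step(rows[-1], rule))
    return rows

if __name__ == "__main__":
    rule = int(sys.argv[1]) if len(sys.argv) > 1 else 30
    for row in evolve(rule, 30, 61):
        print("".join("#" if c else " " for c in row))

Changing the rule number on the command line reproduces the qualitative behavior of the different plots.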

Rule 25:


Rule 22:


Rule 30:


Rule 60:


Rule 73:


The richness of the patterns produced by such very simple interaction rules is still a mystery to me.
I would call it Social Networking by pixels.

Tuesday, January 08, 2008

The next IGT HPC work group meeting



Monday, January 14th, 2008

IGT Offices, Maskit 4, 5th Floor, Hertzliya

14:00-14:15: OPENING - Avner & Guy

14:15-15:00: “GridGain – Java Grid Computing Made Simple”

Speaker: Nikita Ivanov, Founder, GridGain Systems
Duration: 45 minutes
Language: English

Abstract:
This presentation is intended to provide a high-level overview of GridGain, an open source Java grid computing framework. The presentation is arranged to provide both a canonical overview of the software framework and a live coding demonstration underscoring the powerful simplicity of GridGain. The presentation is split into two approximately equal parts:
* In the first part, a formal introduction to GridGain and computational grid computing is provided. The different types of grid computing will be briefly discussed, as well as the key features that GridGain provides.
* In the second part, a live coding example will be shown, demonstrating building and running a grid application from scratch in front of the audience. This demonstration will highlight one of the key advantages of using GridGain: simple and transparent grid-enabling of existing applications using Java annotations.

15:00-15:10 Break

15:10-15:45: Grid Computing – The Simpsons, Shrek 3 and Keeping the Lights On

Speaker: Mike Duffy, Founder and CEO of Axceleon

Duration: 35 minutes
Language: English

Abstract:

I wanted to speak about how Grid Computing is used in two very interesting and important industries: the Hollywood movie business and the power/energy area. Without the ability to build and utilize large compute grids, the movie industry would not be able to generate big special-effects movies like Transformers, Shrek and The Simpsons. Electrical generation companies around the world are looking to grid computing not only to save money but also to increase the reliability of their electrical grids.

15:45-16:00: DISCUSSION AND CONCLUDING REMARKS

To register, please send your contact details to info@grid.org.il

We are looking forward to seeing you!

Best Regards,


Guy Tel-Zur, Ph.D.

Grid-HPC WG Director

IGT

www.Grid.org.il




Saturday, January 05, 2008

First trials with Hadoop


I followed the Hadoop Quickstart guide and the whole process is described below.

This post can be used as a reference for other people installing Hadoop.

My system is OpenSuse 10.3 and Java version is 1.6.0_03.

After downloading and installing the package, I ran the standalone operation test:

$ mkdir input                  # create a local input directory
$ cp conf/*.xml input          # use the config files as sample input
$ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
$ cat output/*                 # examine the results

Here is the output:

gtz2:/home/telzur/downloads/hadoop-0.14.4 # bin/hadoop jar hadoop-0.14.4-examples.jar grep input output 'dfs[a-z.]+'
08/01/05 15:47:13 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
08/01/05 15:47:13 INFO mapred.FileInputFormat: Total input paths to process : 3
08/01/05 15:47:13 INFO mapred.JobClient: Running job: job_local_1
08/01/05 15:47:13 INFO mapred.MapTask: numReduceTasks: 1
08/01/05 15:47:13 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/input/mapred-default.xml:0+180
08/01/05 15:47:13 INFO mapred.MapTask: numReduceTasks: 1
08/01/05 15:47:13 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/input/hadoop-default.xml:0+27489
08/01/05 15:47:13 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/input/hadoop-default.xml:0+27489
08/01/05 15:47:13 INFO mapred.MapTask: numReduceTasks: 1
08/01/05 15:47:13 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/input/hadoop-site.xml:0+178
08/01/05 15:47:13 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/input/hadoop-site.xml:0+178
08/01/05 15:47:14 INFO mapred.LocalJobRunner: reduce > reduce
08/01/05 15:47:14 INFO mapred.TaskRunner: Saved output of task 'reduce_3r6jh8' to file:/home/telzur/downloads/hadoop-0.14.4/grep-temp-346467784
08/01/05 15:47:14 INFO mapred.JobClient: Job complete: job_local_1
08/01/05 15:47:14 INFO mapred.JobClient: Counters: 9
08/01/05 15:47:14 INFO mapred.JobClient: Map-Reduce Framework
08/01/05 15:47:14 INFO mapred.JobClient: Map input records=940
08/01/05 15:47:14 INFO mapred.JobClient: Map output records=34
08/01/05 15:47:14 INFO mapred.JobClient: Map input bytes=27847
08/01/05 15:47:14 INFO mapred.JobClient: Map output bytes=942
08/01/05 15:47:14 INFO mapred.JobClient: Combine input records=34
08/01/05 15:47:14 INFO mapred.JobClient: Combine output records=33
08/01/05 15:47:14 INFO mapred.JobClient: Reduce input groups=33
08/01/05 15:47:14 INFO mapred.JobClient: Reduce input records=33
08/01/05 15:47:14 INFO mapred.JobClient: Reduce output records=33
08/01/05 15:47:14 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
08/01/05 15:47:14 INFO mapred.FileInputFormat: Total input paths to process : 1
08/01/05 15:47:14 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/input/mapred-default.xml:0+180
08/01/05 15:47:14 INFO mapred.JobClient: Running job: job_local_1
08/01/05 15:47:14 INFO mapred.MapTask: numReduceTasks: 1
08/01/05 15:47:14 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/grep-temp-346467784/part-00000:0+1279
08/01/05 15:47:14 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/grep-temp-346467784/part-00000:0+1279
08/01/05 15:47:14 INFO mapred.LocalJobRunner: reduce > reduce
08/01/05 15:47:14 INFO mapred.TaskRunner: Saved output of task 'reduce_h016y4' to file:/home/telzur/downloads/hadoop-0.14.4/output
08/01/05 15:47:14 INFO mapred.LocalJobRunner: reduce > reduce
08/01/05 15:47:14 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/input/hadoop-default.xml:0+27489
08/01/05 15:47:14 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/input/hadoop-site.xml:0+178
08/01/05 15:47:15 INFO mapred.LocalJobRunner: reduce > reduce
08/01/05 15:47:15 INFO mapred.JobClient: Job complete: job_local_1
08/01/05 15:47:15 INFO mapred.JobClient: Counters: 7
08/01/05 15:47:15 INFO mapred.JobClient: Map-Reduce Framework
08/01/05 15:47:15 INFO mapred.JobClient: Map input records=33
08/01/05 15:47:15 INFO mapred.JobClient: Map output records=33
08/01/05 15:47:15 INFO mapred.JobClient: Map input bytes=1193
08/01/05 15:47:15 INFO mapred.JobClient: Map output bytes=929
08/01/05 15:47:15 INFO mapred.JobClient: Reduce input groups=4
08/01/05 15:47:15 INFO mapred.JobClient: Reduce input records=66
08/01/05 15:47:15 INFO mapred.JobClient: Reduce output records=66
08/01/05 15:47:15 INFO mapred.LocalJobRunner: file:/home/telzur/downloads/hadoop-0.14.4/grep-temp-346467784/part-00000:0+1279

gtz2:/home/telzur/downloads/hadoop-0.14.4 # cat output/*
2 dfs.
1 dfs.block.size
1 dfs.blockreport.interval
1 dfs.client.block.write.retries
1 dfs.client.buffer.dir
1 dfs.data.dir
1 dfs.datanode.bind
1 dfs.datanode.dns.interface
1 dfs.datanode.dns.nameserver
1 dfs.datanode.du.pct
1 dfs.datanode.du.reserved
1 dfs.datanode.port
1 dfs.default.chunk.view.size
1 dfs.df.interval
1 dfs.heartbeat.interval
1 dfs.hosts
1 dfs.hosts.exclude
1 dfs.impl
1 dfs.info.bind
1 dfs.info.port
1 dfs.name.dir
1 dfs.namenode.handler.count
1 dfs.namenode.logging.level
1 dfs.network.script
1 dfs.replication
1 dfs.replication.consider
1 dfs.replication.max
1 dfs.replication.min
1 dfs.replication.min.
1 dfs.safemode.extension
1 dfs.safemode.threshold.pct
1 dfs.secondary.info.bind
1 dfs.secondary.info.port
gtz2:/home/telzur/downloads/hadoop-0.14.4 #


Next step: Pseudo-Distributed Operation

Format a new distributed filesystem:
gtz2:/home/telzur/downloads/hadoop-0.14.4 # bin/hadoop namenode -format
08/01/05 17:06:57 INFO dfs.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = gtz2/127.0.0.1
STARTUP_MSG: args = [-format]
************************************************************/
08/01/05 17:06:58 INFO dfs.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
08/01/05 17:06:58 INFO dfs.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at gtz2/127.0.0.1
************************************************************/
gtz2:/home/telzur/downloads/hadoop-0.14.4 #

gtz2:/home/telzur/downloads/hadoop-0.14.4 # bin/start-all.sh
starting namenode, logging to /home/telzur/downloads/hadoop-0.14.4/bin/../logs/hadoop-telzur-namenode-gtz2.out
localhost: starting datanode, logging to /home/telzur/downloads/hadoop-0.14.4/bin/../logs/hadoop-root-datanode-gtz2.out
localhost: starting secondarynamenode, logging to /home/telzur/downloads/hadoop-0.14.4/bin/../logs/hadoop-root-secondarynamenode-gtz2.out
starting jobtracker, logging to /home/telzur/downloads/hadoop-0.14.4/bin/../logs/hadoop-telzur-jobtracker-gtz2.out
localhost: starting tasktracker, logging to /home/telzur/downloads/hadoop-0.14.4/bin/../logs/hadoop-root-tasktracker-gtz2.out
gtz2:/home/telzur/downloads/hadoop-0.14.4 #


Browsing the web interface:






Run the examples:

gtz2:/home/telzur/downloads/hadoop-0.14.4 # bin/hadoop dfs -put conf input
gtz2:/home/telzur/downloads/hadoop-0.14.4 # bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
08/01/05 17:21:39 INFO mapred.FileInputFormat: Total input paths to process : 10
08/01/05 17:21:40 INFO mapred.JobClient: Running job: job_200801051712_0001
08/01/05 17:21:41 INFO mapred.JobClient: map 0% reduce 0%
08/01/05 17:21:51 INFO mapred.JobClient: map 18% reduce 0%
08/01/05 17:21:52 INFO mapred.JobClient: map 36% reduce 0%
08/01/05 17:21:53 INFO mapred.JobClient: map 54% reduce 0%
08/01/05 17:21:54 INFO mapred.JobClient: map 63% reduce 0%
08/01/05 17:21:55 INFO mapred.JobClient: map 72% reduce 0%
08/01/05 17:21:56 INFO mapred.JobClient: map 90% reduce 0%
08/01/05 17:21:57 INFO mapred.JobClient: map 100% reduce 0%
08/01/05 17:22:06 INFO mapred.JobClient: map 100% reduce 27%
08/01/05 17:22:07 INFO mapred.JobClient: map 100% reduce 100%
08/01/05 17:22:08 INFO mapred.JobClient: Job complete: job_200801051712_0001
08/01/05 17:22:08 INFO mapred.JobClient: Counters: 12
08/01/05 17:22:08 INFO mapred.JobClient: Job Counters
08/01/05 17:22:08 INFO mapred.JobClient: Launched map tasks=11
08/01/05 17:22:08 INFO mapred.JobClient: Launched reduce tasks=1
08/01/05 17:22:08 INFO mapred.JobClient: Data-local map tasks=11
08/01/05 17:22:08 INFO mapred.JobClient: Map-Reduce Framework
08/01/05 17:22:08 INFO mapred.JobClient: Map input records=1153
08/01/05 17:22:08 INFO mapred.JobClient: Map output records=43
08/01/05 17:22:08 INFO mapred.JobClient: Map input bytes=34316
08/01/05 17:22:08 INFO mapred.JobClient: Map output bytes=1118
08/01/05 17:22:08 INFO mapred.JobClient: Combine input records=43
08/01/05 17:22:08 INFO mapred.JobClient: Combine output records=39
08/01/05 17:22:08 INFO mapred.JobClient: Reduce input groups=38
08/01/05 17:22:08 INFO mapred.JobClient: Reduce input records=39
08/01/05 17:22:08 INFO mapred.JobClient: Reduce output records=38
08/01/05 17:22:08 INFO mapred.FileInputFormat: Total input paths to process : 1
08/01/05 17:22:09 INFO mapred.JobClient: Running job: job_200801051712_0002
08/01/05 17:22:10 INFO mapred.JobClient: map 0% reduce 0%
08/01/05 17:22:18 INFO mapred.JobClient: map 100% reduce 0%
08/01/05 17:22:24 INFO mapred.JobClient: map 100% reduce 100%
08/01/05 17:22:25 INFO mapred.JobClient: Job complete: job_200801051712_0002
08/01/05 17:22:25 INFO mapred.JobClient: Counters: 10
08/01/05 17:22:25 INFO mapred.JobClient: Job Counters
08/01/05 17:22:25 INFO mapred.JobClient: Launched map tasks=1
08/01/05 17:22:25 INFO mapred.JobClient: Launched reduce tasks=1
08/01/05 17:22:25 INFO mapred.JobClient: Data-local map tasks=1
08/01/05 17:22:25 INFO mapred.JobClient: Map-Reduce Framework
08/01/05 17:22:25 INFO mapred.JobClient: Map input records=38
08/01/05 17:22:25 INFO mapred.JobClient: Map output records=38
08/01/05 17:22:25 INFO mapred.JobClient: Map input bytes=1330
08/01/05 17:22:25 INFO mapred.JobClient: Map output bytes=1026
08/01/05 17:22:25 INFO mapred.JobClient: Reduce input groups=3
08/01/05 17:22:25 INFO mapred.JobClient: Reduce input records=38
08/01/05 17:22:25 INFO mapred.JobClient: Reduce output records=38

Examine the output files:
gtz2:/home/telzur/downloads/hadoop-0.14.4 # bin/hadoop dfs -get output output
gtz2:/home/telzur/downloads/hadoop-0.14.4 # cat output/*
cat: output/output: Is a directory
2 dfs.
1 dfs.block.size
1 dfs.blockreport.interval
1 dfs.client.block.write.retries
1 dfs.client.buffer.dir
1 dfs.data.dir
1 dfs.datanode.bind
1 dfs.datanode.dns.interface
1 dfs.datanode.dns.nameserver
1 dfs.datanode.du.pct
1 dfs.datanode.du.reserved
1 dfs.datanode.port
1 dfs.default.chunk.view.size
1 dfs.df.interval
1 dfs.heartbeat.interval
1 dfs.hosts
1 dfs.hosts.exclude
1 dfs.impl
1 dfs.info.bind
1 dfs.info.port
1 dfs.name.dir
1 dfs.namenode.handler.count
1 dfs.namenode.logging.level
1 dfs.network.script
1 dfs.replication
1 dfs.replication.consider
1 dfs.replication.max
1 dfs.replication.min
1 dfs.replication.min.
1 dfs.safemode.extension
1 dfs.safemode.threshold.pct
1 dfs.secondary.info.bind
1 dfs.secondary.info.port
gtz2:/home/telzur/downloads/hadoop-0.14.4 #

Re-check the web interfaces after the job ended:







Finally, stop the daemons when we are done:
gtz2:/home/telzur/downloads/hadoop-0.14.4 # bin/stop-all.sh
stopping jobtracker
localhost: stopping tasktracker
stopping namenode
localhost: stopping datanode
localhost: stopping secondarynamenod

And that concludes the Hadoop Quickstart tutorial.



Wednesday, January 02, 2008

How to encrypt/decrypt a file using openssl

To encrypt:
# openssl bf -a -salt -in original_file.odt -out encrypted_file.bf
You will be prompted to type and then re-type a password.

Here, bf stands for the Blowfish algorithm, -a base64-encodes the output, and -salt adds a random salt.

To decrypt:
# openssl bf -d -salt -a -in ./encrypted_file1.bf -out ./original_file.odt
use the same password when asked.

-d stands for decryption

For more information and examples type: man enc