Monday, December 24, 2007

A comment on Rebecca's post

A useful teaching example for Parallel Processing courses is given in Rebecca's Blog: "Example Performance Evaluation"
I added the following comment:

"I liked the example, and I think this kind of demonstration really helps explain where bottlenecks may occur in parallel code. I would like to add two comments:
1. Solving the matrix and obtaining 0 for the startup time, after already neglecting the bandwidth, means that the problem is totally Embarrassingly Parallel, and this is indeed the case for MC calculations. It would be interesting to repeat the exercise for a communication-intensive application, e.g. solving the Laplace equation.
2. I think it is worth mentioning Collective Communication commands, e.g. Bcast and Reduce, in this context. In particular, Reduce with summation can be more efficient than the loop over the Recv because it can do the summation out of order, and this may hide some of the waiting/communication time. Another alternative is to use MPI_ANY_SOURCE in the Recv command.

--Guy"
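
The second point can be illustrated with a small analogy. This is only a sketch using Python threads, not MPI itself: the worker function, its sleep times and the partial results are all made up for the illustration. Waiting on the workers in rank order plays the role of the loop over Recv, and `as_completed` plays the role of MPI_ANY_SOURCE, where results are added in whatever order they arrive.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def worker(rank):
    """Stand-in for one MPI rank (hypothetical workload)."""
    time.sleep(0.01 * (4 - rank))   # rank 0 is the slowest one here
    return rank * rank              # partial result to be summed

def sum_in_rank_order(futures):
    # Like a loop over Recv(source=i): blocks on rank 0 first,
    # even when later ranks are already done.
    return sum(f.result() for f in futures)

def sum_in_arrival_order(futures):
    # Like Recv with MPI_ANY_SOURCE: adds whichever result is ready first.
    return sum(f.result() for f in as_completed(futures))

with ThreadPoolExecutor(max_workers=4) as pool:
    ordered = sum_in_rank_order([pool.submit(worker, r) for r in range(4)])
    arrival = sum_in_arrival_order([pool.submit(worker, r) for r in range(4)])

print(ordered, arrival)
```

Both totals are the same; only the order in which the partial results are consumed differs, and that is where the waiting time can be hidden.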

Tuesday, December 18, 2007

Cloud Computing


I created a new group in Facebook about Cloud Computing.
You are all welcome to join!
Click here to visit the group web page:
http://www.facebook.com/group.php?gid=8450870046

Wednesday, December 12, 2007

TechnologyTrends: The next big thing must be Schnoodle

Google - The famous search engine and Internet related technologies
Doodle - Event scheduling
Moodle - A free open source course management system (CMS)
Schnoodle - A Schnoodle is a Poodle hybrid that is a cross-breed of a Poodle and a Schnauzer

Saturday, December 08, 2007

Do Google results make sense?

I did some digging with Google and became puzzled:
I searched for a few buzzwords and then repeated the search with combinations of them.
The number of entries returned is not in accordance with the known laws of arithmetic!

"Grid Computing" 171,000 entries
where "XXX YYY" means with the exact phrase XXX YYY!
virtualization 1,530,000 entries
"Grid Computing" Virtualization 1,300,000 entries
where space between the terms means with all of the words.
virtualization -"grid computing" 2,580,000 entries
where "-" means exclude the term from the search
-virtualization "grid computing" 4,510,000 entries

In order to verify this mystery I repeated the test with two other terms:

Israel 26,600,000 entries
Jerusalem 4,190,000 entries
Jerusalem Israel 1,120,000 entries
jerusalem -israel 2,150,000 entries
-jerusalem israel 45,300,000 entries
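
The inconsistency can be made explicit with a small check. This is only a sketch: it assumes that the hit counts behave like sizes of sets, so that for any two terms A and B, count(A) = count(A and B) + count(A without B). The numbers are copied from the two tests above; the function and variable names are mine.

```python
def excess(total, both, only):
    """How far (both + only) overshoots the reported total; 0 if consistent."""
    return both + only - total

# (total for the term, hits with both terms, hits with the term but not the other)
checks = {
    '"Grid Computing"': (171_000, 1_300_000, 4_510_000),
    "virtualization": (1_530_000, 1_300_000, 2_580_000),
    "Jerusalem": (4_190_000, 1_120_000, 2_150_000),
    "Israel": (26_600_000, 1_120_000, 45_300_000),
}

results = {term: excess(*numbers) for term, numbers in checks.items()}

for term, value in results.items():
    print(f"{term}: parts minus total = {value:,}")
```

A consistent index would give zero in every row. Here only Jerusalem comes out smaller than its reported total; the other three terms have parts that add up to more than the whole.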

To my understanding the situation can be demonstrated as in the following plot:




Does this mean we should make a compromise in East Jerusalem's territory
(have less Jerusalem and get more Israel) ???


Comments are welcome to shed light on this mystery

Note: If you want to reproduce my test take into account:
1. There may be a small change in the number of entries found. This fluctuation is however negligible.
2. There is another small difference in the number of entries if you try capital letters instead of small letters or change the order of the words.
3. Drawing was produced using the free tool Dia.

Wednesday, November 28, 2007

"Grid Confinement" - A Physicist perspective

The following post should not be taken too seriously!

If Grid Computing is so good, but what we see in practice is complete chaos in the Data Center, then there must exist an Anti-Grid that cancels all its benefits.
If one tries to impose order by splitting the Data Center into two separate sites, then Grid - Anti-Grid pairs are immediately produced out of the vacuum and chaos is restored.

Monday, November 26, 2007

Social Networks and Grid Computing - Some similarities

There are some similarities between Social Networks, e.g. Facebook and Grid Computing.
In the table below I mention a few.

Friday, November 23, 2007

Mounting a memory stick under Linux

A short reminder for myself and for all those who are lost when no auto mount works:

1. Verify that the device is recognized in the 'dmesg' output and write down the device name, e.g. /dev/sdb
2. mkdir /disk_on_key
3. mount -t vfat /dev/sdb /disk_on_key (if the stick has a partition table, mount the partition instead, e.g. /dev/sdb1)
4. At the end, un-mount by: umount /disk_on_key

Good luck!

Sunday, November 18, 2007

GSS authentication failure - A question to the gt-users mailing list

Today I posted the following question:

I installed Globus version gt4.0.4-x86_64_rhas_4-installer on an Opteron node running CentOS 4.4.
The output of uname -a is:
2.6.9-55.0.2.ELsmp #1 SMP Tue Jun 26 14:14:47 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux

In the globus-gatekeeper.log there is the following error message:

TIME: Thu Oct 25 18:16:50 2007
PID: 14589 -- Notice: 6: globus-gatekeeper pid=14589 starting at Thu Oct 25 18:16:50 2007

TIME: Thu Oct 25 18:16:50 2007
PID: 14589 -- Notice: 6: Got connection xxx.xxx.xxx.xxx at Thu Oct 25 18:16:50 2007

GSS authentication failure
GSS Major Status: General failure
GSS Minor Status Error Chain:
globus_gsi_gssapi: Error during delegation: Delegation protocol violation
Failure: GSS failed Major:000d0000 Minor:00000002 Token:00000000

TIME: Thu Oct 25 18:16:50 2007
PID: 14589 -- Failure: GSS failed Major:000d0000 Minor:00000002 Token:00000000

I would appreciate any help resolving this problem

Regards,

Guy Tel-Zur


----------------------------------------------------------------
Then, I received the following reply from Charles Bacon:

That's the sign of a client who disconnected because it didn't trust the gatekeeper. From the gatekeeper's point of view, the disconnect of the client is a violation of protocol. It doesn't indicate anything wrong with your gatekeeper.


Charles

---------------------------------------------------------

Thank you for your reply!

Thursday, November 15, 2007

Software Vulnerability: A new group in Facebook

I created a new group in Facebook to discuss and share knowledge in Software Vulnerability.
I keep references to relevant links and documents there.
You are welcome to join the group: Facebook | Software Vulnerability

My page in Facebook

(Note: In order to access the group you need first to login to your Facebook account)


--Guy

Sunday, November 11, 2007

The 9th edition of PhysicaPlus is on the air



The Israel Physical Society (IPS) on-line magazine, PhysicaPlus, is published a few times a year both in English and in Hebrew and is an excellent source for semi-popular Physics articles.
You are invited to visit PhysicaPlus

Sunday, November 04, 2007

IGT Annual Event & Exhibition

Dec. 3rd, 2007, The Next Generation Data Center

Click on the image for further information:

Wednesday, October 31, 2007

Que es mas macho, BlueGene o EGEE? (*)

I just read in Supercomputingonline about EGEE III, the 3rd phase of the gigantic Grid Computing project:

"EGEE-III will last for 24 months, with a total manpower bid of almost 10,000 person months and an EC budget of Euro 36 million. As with EGEE-II, partners will provide extra effort to the project beyond that funded by the EC, bringing the total project budget up to Euro 70 million, and also contributing a further estimated Euro 50 million worth of computing resources."

and I ask myself what is more justified: a single extended 3 PFLOPS BlueGene/P with 884,736 processors, or thousands of various kinds of old 32-bit boxes distributed all over the world running Scientific Linux 3.0.X and gLite?

For years we were told that no single data center site would be able to cope with the few PB of data per year that will be produced by the LHC. Nice PowerPoint presentations showed a 20 km tower of CD-ROMs, higher than Mont Blanc, indicating that only a distributed Grid Computing environment would save us from a catastrophe.

Perhaps it is time to ask, in the spirit of Byron Katie's book Loving What Is, her four questions
  1. Is it true?
  2. Can you absolutely know that it's true?
  3. How do you react when you believe that thought?
  4. Who would you be without the thought?
And please consider this: I still love Grid Computing!


-----------
*Inspired by the lyrics of Laurie Anderson's song "Smoke Rings": "Que es mas macho, pineapple o knife?"

Sunday, October 21, 2007

The End of Grid Computing?

In 2003 the MIT Technology Review ranked "Grid Computing" among the 10 Emerging Technologies That Will Change the World [1].
Four years later, something is not going well with "Grid Computing".
An indication that there is a problem can easily be seen by looking at the "Google Trends" plot for the term "Grid Computing":


(click on the image to get the current trend).
This finding can be compared with another buzz word, "Virtualization", which is older than "Grid Computing" and yet is gaining more and more momentum:


There is, however, one exception. The academic Grid still enjoys lots of glory thanks to the huge, heavily funded European (EGEE) and US projects. When LHC data taking starts at CERN it will reach its peak importance. But it seems that for other scientific projects Grid Computing is not going to be such a success. It will remain a "nice to have" but will never replace High-Performance Computing (HPC) on the one hand, or classical distributed computing tools such as Condor [2], which has existed for more than 20 years, on the other.
Once the government funding is removed, the hype around academic Grid Computing will decline very quickly as well.
As was pointed out in an interesting talk by Fabrizio Gagliardi about the future of grid computing at the GridKa07 School, other kinds of Grid Computing infrastructures that stand on stable financial ground may emerge as the successors, for example Amazon's S3 and EC2 and the joint IBM and Google cloud computing initiative.



[1] http://www.technologyreview.com/Infotech/13060/page6/

[2] http://www.cs.wisc.edu/condor

Thursday, September 13, 2007

GridKa School, Karlsruhe

I participated in the Grid Computing school, GridKa2007, that took place in Karlsruhe, Germany.

Wednesday, September 12, 2007

If something can be simple it must be complicated in Gentoo

Eclipse version 3.3 does not work out of the box on Sabayon Linux (Gentoo). It likes Java 1.6.
After installing this JDK from SUN I got a strange error message saying that "run-java-tool is not available". Then I noticed that /usr/bin/java was pointing to /usr/bin/run-java-tool. I know that it was invented for good reasons, but in order to start playing quickly with Eclipse I just removed this link and put in its place:
"ln -s /usr/lib/jvm/sun-jdk-1.6/bin/java /usr/bin/java", then Eclipse was happy and me too!!!

Enjoy Eclipse

Sunday, June 24, 2007

Grid-HPC WG Meeting

June 28th, 14:00-16:00 - Grid-HPC Work Group meeting

Location: IGT Offices, Maskit 4, 5th Floor, Hertzliya

Agenda:

14:00-14:15: OPENING - Avner & Guy

14:15-14:50:

DEBUGGING AND OPTIMIZING APPLICATIONS FOR MULTICORE MPP ARCHITECTURES

Speaker:
Jacques Philouze, Vice President Sales & Marketing, Allinea

Abstract:
As two, four and potentially eight-core processors become the norm, the de facto HPC architecture is tending towards large clusters of modest 8-16 core shared-memory servers, potentially with co-processing devices (e.g. GPGPUs, FPGAs, Clearspeed). Programming these machines optimally presents a number of challenges, and applications that use mixed programming models are now becoming commonplace.
In this presentation we will discuss the challenges facing today's HPC application developers, and the need for simple tools that can address mixed programming models. We will present new multicore features of Allinea's Distributed Debugging Tool (DDT) and Optimisation and Profiling Tool (OPT), and discuss our aims to provide a consolidated, scalable, yet intuitive framework for HPC developers.

14:50-15:00: BREAK

15:00-15:35:

FastDL - Cluster Computing with IDL

Timely Visualization and Analysis of Large Data Sets Using IDL and Parallel Computing



Speaker:

Arie Rubin M.Sc.E.E.

IIT-Image Information Technologies (Represents ITT VIS, Boulder Colorado USA)

Abstract:

Scientists exploring fluid and particle dynamics, high-energy and plasma physics, astrophysics and space sciences, biophysics, protein folding and medical science are challenged to visualize and analyze increasingly complex data. With FastDL scientists and developers can run IDL visualization and analysis applications in parallel on cost-effective Linux clusters, significantly shortening the time required to get results. FastDL is comprised of two independent components that address the varying needs of parallel data analysis and visualization applications: TaskDL and mpiDL. TaskDL allows users to run IDL procedures on multiple machines simultaneously by collecting different remote processors together as a task farm. mpiDL provides a Message Passing Interface (MPI) within IDL for synchronizing and passing data between nodes during program execution.

GRIDL (Grid Computing with IDL): running parallel IDL applications on a set of nodes communicating over a WAN.


15:35-15:45: DISCUSSION AND CONCLUDING REMARKS

To register, please send your contact details to: info@grid.org.il

We are looking forward to seeing you!

Best Regards,

Guy Tel-Zur

Grid-HPC WG Director

IGT

Saturday, June 16, 2007

How to convert data cd/dvd back into an iso file?

Very simple:
type "df" to find the dvd file system name, then:

dd if=/dev/dvd_fs of=mydvd.iso


That's it!
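
For readers who prefer a script, here is a rough Python equivalent of the dd command above. It is only a sketch: the function name is mine, /dev/dvd_fs is the same placeholder as in the command, and reading a real device needs the appropriate permissions.

```python
import shutil

def copy_image(source_path, iso_path, block_size=1024 * 1024):
    """Copy every byte of source_path into iso_path, one block at a time."""
    with open(source_path, "rb") as src, open(iso_path, "wb") as dst:
        shutil.copyfileobj(src, dst, block_size)

# Example call with the placeholder device name used above:
# copy_image("/dev/dvd_fs", "mydvd.iso")
```

Like dd, this copies the raw contents of the device rather than the individual files on it, which is why the result is a valid iso image.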

Friday, June 15, 2007

How To Install VMware Server On A Fedora 7 Desktop | HowtoForge - Linux Howtos and Tutorials

VMware on Fedora-7 (F7)

I found the following reference useful when I installed VMware:
How To Install VMware Server On A Fedora 7 Desktop | HowtoForge - Linux Howtos and Tutorials
However, there still was a problem: After defining a new virtual machine I got this error message:

"Unable to change virtual machine power state. The "/usr/lib/vmware/bin/vmware-vmx" process did not start properly...."

Any ideas what to do next????

================================
Later addition:
I think this is because the latest VMware is not ready yet for the advanced F7 and its latest kernel.
What I did was to fall back to CentOS 5.

Wednesday, May 30, 2007

Diskless tips - Part 2

1) In order to enable services like NFS mount and ypbind on the diskless clients edit the rc.local file in the image tree on the server like this:
[root@grid02]# more /diskless/centos44_diskless_ver_03/root/etc/rc.local
#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local

#--------added manually by Guy, 30/5/2007:
mount -t nfs 192.168.1.2:/home /home
ypdomainname WRITE_HERE_THE_DOMAIN_NAME
ypbind
ypset 192.168.1.2

2) In order to document the installation procedure, here is the file and directory structure on the server that is relevant for the PXE boot and tftp stage:
[root@grid02 ~]# ls /tftpboot/
linux-install

[root@grid02 ~]# ls /tftpboot/linux-install/
centos44_diskless_ver_01 centos44_diskless_ver_03 pxelinux.0
centos44_diskless_ver_02 msgs pxelinux.cfg

[root@grid02 ~]# ls /tftpboot/linux-install/centos44_diskless_ver_03
initrd.img vmlinuz

[root@grid01 ~]# ls /tftpboot/linux-install/pxelinux.cfg/
C0A80101 C0A80106 C0A80109 C0A8010C C0A8010F C0A80112 default
C0A80103 C0A80107 C0A8010A C0A8010D C0A80110 C0A80113 pxeos.xml
C0A80104 C0A80108 C0A8010B C0A8010E C0A80111 C0A80114
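
The C0A801xx names in the listing above are not arbitrary: PXELINUX looks for a configuration file named after the client's IPv4 address written in uppercase hexadecimal, and falls back to the file called default if none matches. A small sketch of the conversion in both directions (the function names are mine):

```python
import ipaddress

def pxelinux_config_name(ip):
    """Return the 8-digit uppercase hex form of an IPv4 address."""
    return format(int(ipaddress.IPv4Address(ip)), "08X")

def ip_from_config_name(name):
    """Inverse conversion: hex file name back to dotted-quad notation."""
    return str(ipaddress.IPv4Address(int(name, 16)))

print(pxelinux_config_name("192.168.1.1"))
print(ip_from_config_name("C0A80114"))
```

So the files listed above cover clients in the 192.168.1.x range, one file per diskless node.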

Typical content of a C0A.... file:
[telzur@grid02 pxelinux.cfg]$ more C0A80101
default centos44_diskless_ver_03

label centos44_diskless_ver_03
kernel centos44_diskless_ver_03/vmlinuz
append initrd=centos44_diskless_ver_03/initrd.img root=/dev/ram0 init=disklessrc NFSROOT=192.168.1.2:/diskless/centos44_diskless_ver_03 ramdisk_size=16254 ETHERNET=eth0 NISDOMAIN=(none)

[telzur@grid02 pxelinux.cfg]$ more pxeos.xml





Monday, May 21, 2007

My first test with the IL-BGU EGEE Site

[telzur@cs-grid4 tests]$ voms-proxy-init --voms dteam
Enter GRID pass phrase:
Your identity: /C=IL/O=IUCC/OU=BGU/CN=Guy Tel-Zur
Cannot find file or dir: /home/telzur/.glite/vomses
Creating temporary proxy ........................................................ Done
Contacting lcg-voms.cern.ch:15004 [/C=CH/O=CERN/OU=GRID/CN=host/lcg-voms.cern.ch] "dteam" Done
Creating proxy .......................................................... Done
Your proxy is valid until Wed May 23 05:27:43 2007
[telzur@cs-grid4 tests]$ voms-proxy-info
subject : /C=IL/O=IUCC/OU=BGU/CN=Guy Tel-Zur/CN=proxy
issuer : /C=IL/O=IUCC/OU=BGU/CN=Guy Tel-Zur
identity : /C=IL/O=IUCC/OU=BGU/CN=Guy Tel-Zur
type : proxy
strength : 512 bits
path : /tmp/x509up_u33335
timeleft : 11:59:55
[telzur@cs-grid4 tests]$ glite-job-submit hello2.jdl

Selected Virtual Organisation name (from proxy certificate extension): dteam
Connecting to host g01.phy.bg.ac.yu, port 7772
Logging to host g01.phy.bg.ac.yu, port 9002


*********************************************************************************************
JOB SUBMIT OUTCOME
The job has been successfully submitted to the Network Server.
Use glite-job-status command to check job current status. Your job identifier is:

- https://g01.phy.bg.ac.yu:9000/ZUMjYMdDBZW4cH9pa9aLyg


*********************************************************************************************

[telzur@cs-grid4 tests]$
[telzur@cs-grid4 tests]$ glite-job-status https://g01.phy.bg.ac.yu:9000/ZUMjYMdDBZW4cH9pa9aLyg


*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://g01.phy.bg.ac.yu:9000/ZUMjYMdDBZW4cH9pa9aLyg
Current Status: Done (Success)
Exit code: 0
Status Reason: Job terminated successfully
Destination: sbgce1.in2p3.fr:2119/jobmanager-pbs-dteam
Submitted: Tue May 22 17:28:18 2007 IDT
*************************************************************

[telzur@cs-grid4 tests]$ glite-job-output --dir . https://g01.phy.bg.ac.yu:9000/ZUMjYMdDBZW4cH9pa9aLyg

Retrieving files from host: g01.phy.bg.ac.yu ( for https://g01.phy.bg.ac.yu:9000/ZUMjYMdDBZW4cH9pa9aLyg )

*********************************************************************************
JOB GET OUTPUT OUTCOME

Output sandbox files for the job:
- https://g01.phy.bg.ac.yu:9000/ZUMjYMdDBZW4cH9pa9aLyg
have been successfully retrieved and stored in the directory:
/home/telzur/tests/telzur_ZUMjYMdDBZW4cH9pa9aLyg

*********************************************************************************


[telzur@cs-grid4 tests]$ cd telzur_ZUMjYMdDBZW4cH9pa9aLyg/
[telzur@cs-grid4 telzur_ZUMjYMdDBZW4cH9pa9aLyg]$ ls
hw.err hw.out
[telzur@cs-grid4 telzur_ZUMjYMdDBZW4cH9pa9aLyg]$ more ./hw.err
[telzur@cs-grid4 telzur_ZUMjYMdDBZW4cH9pa9aLyg]$ more ./hw.out
"Hello World"
[telzur@cs-grid4 telzur_ZUMjYMdDBZW4cH9pa9aLyg]$

So far it looks GOOD!!!

Tuesday, May 15, 2007

Kernel Virtual Machine is working on my laptop

I followed the instructions from: http://kvm.qumranet.com/kvmwiki/Debian on my Dell D420 laptop running Ubuntu 7.04.

Here is a screenshot showing Windows XP and CentOS 5 running as guest operating systems (click on the image to enlarge):

Sunday, May 13, 2007

Dapper Drake on a Dell Latitude D420

I used this great link for installing my wireless card on my Dell 420 using Ubuntu 7.04

Dapper Drake on a Dell Latitude D420

Excellent job.

Thanks

Monday, March 12, 2007

IL-BGU home page

Today I updated my IL-BGU portal.
visit: http://www.bgu.ac.il/~tel-zur/grid.html
or click on the map to enter:

Sunday, March 11, 2007

Diskless client installation

In the RHEL documentation (Chapter 4. "Diskless Environments", item #4) there is an inaccuracy in the rsync syntax usage.
Instead of:
rsync -a -e ssh golden_client:/ /diskless/whatever/root/
Use the following syntax which does not produce error messages:
rsync -v -a -e ssh --exclude='/sys/*' --exclude='/proc/*' golden_client:/ /diskless/whatever/root/

Other comments (that I find critical in my experience):

1) Make sure the following RPM is installed in the client: busybox-anaconda
(In my CentOS4.4 case it is: busybox-anaconda-1.00.rc1-5.x86_64.rpm)
2) Make sure there are no active NFS mounted partitions on the client before executing 'rsync'.
3) Make sure that the following 2 directories exist:
/tftpboot/linux-install/msgs/
/tftpboot/linux-install/pxelinux.cfg

and that in the tftp definition file /etc/xinetd.d/tftp the following line is set correctly:
server_args = -s /tftpboot/linux-install

4) If pxelinux.0 does not exist under /tftpboot/linux-install, copy it:
cp /usr/lib/syslinux/pxelinux.0 /tftpboot/linux-install/.

5) Pay attention to the directory hierarchy:
if /etc/xinetd.d/tftp contains the line shown above:
server_args = -s /tftpboot/linux-install
then in /etc/dhcpd.conf the reference to pxelinux.0 should be:
filename "pxelinux.0";
and not:
filename "linux-install/pxelinux.0";
or any other variation of it, which will cause the diskless nodes to fail to boot.
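
The reason behind item 5 is that the name given in dhcpd.conf is resolved relative to the directory passed to the tftp server with -s. A small sketch of that resolution (the helper function is mine and only mimics the path lookup; it does not talk to a tftp server):

```python
import posixpath

def tftp_path(server_root, dhcp_filename):
    """Path on the server's disk that a client request ends up at,
    assuming the tftp daemon treats server_root as its root (-s option)."""
    relative = dhcp_filename.lstrip("/")
    return posixpath.normpath(posixpath.join(server_root, relative))

root = "/tftpboot/linux-install"
good = tftp_path(root, "pxelinux.0")
bad = tftp_path(root, "linux-install/pxelinux.0")

print(good)
print(bad)
```

The second path contains linux-install twice, so it points to a file that does not exist and the client never receives its boot loader.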


Friday, March 09, 2007

Howto add Java Plugin to Firefox on Linux

Although this is probably documented on numerous sites, here is my summary:

1) Install the JRE package from SUN (download and then 'chmod u+x' to the *.bin file).
2) cd to /usr/lib/firefox-1.5.0.5/plugins
3) Type:
[root@guydell plugins]# ln -s /usr/java/jre1.5.0_11/plugin/i386/ns7/libjavaplugin_oji.so
[root@guydell plugins]# ls -l
total 36
lrwxrwxrwx 1 root root 58 Mar 9 17:16 libjavaplugin_oji.so -> /usr/java/jre1.5.0_11/plugin/i386/ns7/libjavaplugin_oji.so
-rwxr-xr-x 1 root root 14288 Jul 29 2006 libnullplugin.so
-rwxr-xr-x 1 root root 7564 Jul 29 2006 libunixprintplugin.so

Saturday, February 24, 2007

Invitation to the next IGT HPC Work group meeting



Monday, March 26th, 2007

IGT Offices, Maskit 4, 5th Floor, Hertzliya


Agenda:


14:00-14:15: OPENING - Avner & Guy


14:15-14:50: JPPF, JAVA, OPEN-SOURCE AND GRID: BEYOND THE TRADITIONAL GRID

Speaker: Laurent Cohen, ILOG, Inc. and JPPF founder

Abstract:

Traditional Grid architectures rely on a concept of job submission that comes with a set of constraints regarding the nature of the compute nodes, the ease of deployment of the jobs, and the effort required to use the Grid, which could otherwise be utilized to work on the problems to solve. JPPF offers an alternative, enabling a true heterogeneous nature of the Grid components, an ease of use that permits engineers, developers and scientists to focus on their domain rather than on the grid infrastructure, while retaining the benefits of the Grid technology to solve heavy and complex computational problems.
We will present how the design and architectural choices in JPPF, in terms of programming language, installation, administration, dynamic configuration, updates automation, application code deployment and security policy automation, bring outstanding benefits in the areas of the cost of adoption, level of effort at the operational and organizational levels, as well as the resulting ease of use for the end-users.


14:50-15:00: BREAK


15:00-15:35: INTERACTIVE, USER FRIENDLY, PARALLEL COMPUTING ON CLUSTERS

Speaker: Yoel Jacobsen/E&M CTO
Abstract:
The productivity boost that Star-P brings to teams of MATLAB-skilled domain experts is matched by the economic benefits of getting the most out of the computing power of next generation servers and clusters. Star-P enables an interactive workflow for large-scale scientific and engineering computing, eliminating the intermediate steps of reprogramming the code in C, Fortran, and MPI, and dramatically shortening the time to insight.


15:35-15:45: DISCUSSION AND CONCLUDING REMARKS


To register, please send your contact details to info@grid.org.il


We are looking forward to seeing you!


Best Regards,


Guy Tel-Zur

Grid-HPC WG Director

IGT


www.Grid.org.il

Monday, January 22, 2007

Summary of the IGT HPC Work group first year

The following two slides give a perspective of the IGT HPC Work group activity during its first year.
There were 4 meetings with an average number of about 20 attendees.

If your company/research group would like to give a presentation in this forum please contact me.

Sunday, January 07, 2007

OpenVPN my first trial

This note describes establishing a VPN connection between my laptop and my remote Linux server.
I was inspired by the first example in Paul Duncan's article "Casting Your Net with OpenVPN", which was published in Linux Magazine.

First I created configuration files at both ends; see the screen dump of the client configuration file:

Then I started OpenVPN on both computers:

and similarly at the remote server:

After establishing the connection, I could connect via my private network:
I connected via SSH from my client (10.55.55.2) to my server (10.55.55.1):