Author Archive

Parallel multi process bash with return codes

August 12th, 2010 thughes No comments

Have you ever needed to run a bunch of long-running processes from a bash script and get their return codes? I come across this issue quite frequently in my line of work. The most common case is where I need to run rsync to collect files from many machines and then, if that succeeds, run some other task. Depending on the number of servers and the amount of data, this can take several hours to run sequentially, and I don’t really like waiting around to check the output so that I can run the next task.

How to speed it up? The obvious way would be to background the rsync commands, but then I don’t know whether they were all successful. What if one fails? How would I know which one? Somehow I needed to catch the return codes of all the sub-shells and be able to match them to a command. This is where the bash builtin wait comes into play.

~]$ help wait
wait: wait [id]
Wait for job completion and return exit status.

Waits for the process identified by ID, which may be a process ID or a
job specification, and reports its termination status. If ID is not
given, waits for all currently active child processes, and the return
status is zero. If ID is a a job specification, waits for all processes
in the job’s pipeline.

Exit Status:
Returns the status of ID; fails if ID is invalid or an invalid option is
given.

The idea is to collect the PID of each sub-shell using the $! variable and add it to a list. Then we call wait on each PID in turn; wait blocks until that sub-shell finishes and returns its exit status, which we read from $?. By adding these return codes to a second list in the same order, we can iterate over them and match them up with the original list. We split the original list into an array towards the end so we can reference the individual items by index.

In the following example I have used wget and deliberately changed testfile03.txt to testfile06.txt, which does not exist, to show an example of a non-zero return code.

testfile01
testfile02
testfile03
testfile04

#!/bin/bash
# URLs to fetch; testfile06.txt does not exist, so that wget will fail
my_list="
http://www.thegoldfish.org/wp-content/uploads/2010/08/testfile01.txt
http://www.thegoldfish.org/wp-content/uploads/2010/08/testfile02.txt
http://www.thegoldfish.org/wp-content/uploads/2010/08/testfile06.txt
http://www.thegoldfish.org/wp-content/uploads/2010/08/testfile04.txt
"

results=''
pids=''
# Start every download in the background and remember its PID
for X in $my_list ; do
    wget -o /dev/null "$X" &
    pid=$!
    pids="$pids $pid"
done

# Wait for each PID in launch order and record its exit status
for pid in $pids
do
    wait "$pid"
    result=$?
    results="$results $result"
done

echo $results

# Split the URL list into an array so each result can be matched by index
i=0
my_array=( $my_list )
for ret_val in $results; do
    echo "${my_array[$i]} returned $ret_val"
    ((i++))
done
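
The same idea can also be written with bash arrays, which avoids relying on word splitting of space-separated strings. This is a minimal sketch rather than a drop-in replacement: the function name run_all and the codes array are made up for illustration, and the commands are harmless stand-ins (true, false, sleep) instead of wget or rsync.

```shell
#!/bin/bash
# Run every argument as a shell command in the background, then store the
# exit codes in the global array "codes", in the same order as the arguments.
run_all() {
    local -a pids=()
    local cmd pid
    codes=()
    for cmd in "$@"; do
        bash -c "$cmd" &
        pids+=("$!")
    done
    for pid in "${pids[@]}"; do
        wait "$pid"
        codes+=("$?")
    done
}

# Stand-in commands: one succeeds, one exits with 3 after a second, one fails.
commands=("true" "sleep 1; exit 3" "false")
run_all "${commands[@]}"

# Match each exit code back to its command by index.
for i in "${!commands[@]}"; do
    echo "${commands[$i]} returned ${codes[$i]}"
done
```

Because the PIDs and the exit codes are appended in the same order, index i in codes always belongs to index i in commands, even when a later command finishes before an earlier one.
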
Categories: Bash, Linux, Sysadmin Tags:

Installing Thrift on Fedora 13 (Goddard)

July 22nd, 2010 thughes No comments

From Thrift Site – “Thrift is a software framework for scalable cross-language services development. It combines a software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, Smalltalk, and OCaml.”

When I first tried to install Thrift, configure stopped with the error below. The message makes you think something is wrong with boost, but I am pretty sure the real issue was that I was missing gcc-c++.

checking for boostlib >= 1.33.1... configure: error: We could not detect the boost libraries (version 1.33 or higher). If you have a staged boost library (still not installed) please specify $BOOST_ROOT in your environment and do not give a PATH to --with-boost option. If you are sure you have boost installed, then check your version number looking in <boost/version.hpp>. See http://randspringer.de/boost  for more documentation.

I did this on a ‘Desktop’ install on my laptop so this may not be every package required. You will need to install at least the following packages.

yum -y install autoconf automake libtool flex \
 boost-devel gcc-c++ perl-ExtUtils-MakeMaker byacc
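
Since the boost message can hide a missing compiler, a quick pre-flight check before running configure may save some head scratching. The following is a minimal sketch and not part of the Thrift build; the function name missing_tools is made up for illustration, and the tool list assumes the usual binary names shipped by these packages (for example g++ from gcc-c++).

```shell
#!/bin/bash
# Print the name of every tool from the argument list that is not on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed binary names for the build dependencies; adjust to your system.
missing=$(missing_tools autoconf automake libtool flex g++ byacc)
if [ -n "$missing" ]; then
    echo "Missing build tools:" $missing
else
    echo "All build tools found"
fi
```

This only checks for executables, so it will not notice a missing boost-devel or perl-ExtUtils-MakeMaker.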

Check out Thrift from Subversion to get the latest and greatest.

svn co http://svn.apache.org/repos/asf/incubator/thrift/trunk thrift
cd thrift

Then, following the README, you need to do the following.

./bootstrap.sh
 
./configure
 
make
 
sudo make install

And that appears to be it. If you find any issues please leave a comment and I will try on an @base install of Fedora to iron out any other dependencies.

Categories: Uncategorized Tags:

Stunnel in client mode

January 22nd, 2010 thughes No comments

Stunnel is a quick way of taking a non-SSL connection and wrapping it in SSL for security.

stunnel version 4 – Fedora 12 / RHEL 5.3 / CentOS 5.3

Edit the configuration file:

vim /etc/stunnel/stunnel.conf

Add the following:

client=yes
[gmail]
accept  = 127.0.0.1:50000
connect = mail.google.com:443

Then run:

stunnel

stunnel version 3 – Ubuntu 8.10 (I haven’t used newer versions)

Ubuntu 8.10 has two versions of stunnel, stunnel3 and stunnel4, and /usr/bin/stunnel is a symbolic link to /usr/bin/stunnel3.

If you would like to use version 4, use the stunnel4 command. Otherwise, to use the default version, you will need to create a self-signed certificate:

openssl req -new -x509 -days 3650 -nodes -out /etc/stunnel/stunnel.pem -keyout /etc/stunnel/stunnel.pem

Then start stunnel with the following command:

stunnel -c -d localhost:50001 -r mail.google.com:443
Categories: Sysadmin Tags:

Delete single line from file

January 18th, 2010 thughes No comments

I quite often need to remove a single line from a file by its line number. The most common use case for me is the known_hosts file after I have reinstalled a system. In the past I have opened the file in vim, navigated to the line and removed it. This is all well and good, but it gets to be a pain having to do it repeatedly, especially when you manage around 1000 servers and they get rebuilt frequently. Finally, today I had had enough, so I wrote a little script to do this task easily. Hopefully someone else finds this useful.

Its usage is: delline LINE FILE

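#!/bin/bash
# Usage: delline LINE FILE
# Delete line number LINE from FILE in place.
LINE=$1
FILE=$2
if [ ! -f "$FILE" ] ; then
    echo "can't read $FILE: No such file or directory"
    exit 1
fi
# Accept only a positive whole number; the old expr test also let
# values such as 0 and -1 through, which sed then rejects.
if [[ $LINE =~ ^[1-9][0-9]*$ ]] ; then
    sed -i "${LINE}d" "$FILE"
else
    echo "$LINE is not numeric"
    exit 1
fi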
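
To see the sed delete command at work without risking a real file, it can be tried on a throwaway copy. This is a minimal sketch, assuming GNU sed (for -i) and mktemp; the function name delete_line and the sample host entries are made up for illustration.

```shell
#!/bin/bash
# Delete line number $1 from file $2 in place, refusing non-numeric input.
delete_line() {
    local line=$1 file=$2
    if ! [[ $line =~ ^[1-9][0-9]*$ ]]; then
        echo "$line is not a line number" >&2
        return 1
    fi
    sed -i "${line}d" "$file"
}

# Build a small stand-in for known_hosts and remove its second entry.
tmp=$(mktemp)
printf '%s\n' 'host-a ssh-rsa AAAA1' 'host-b ssh-rsa AAAA2' 'host-c ssh-rsa AAAA3' > "$tmp"
delete_line 2 "$tmp"
remaining=$(cat "$tmp")
echo "$remaining"
rm -f "$tmp"
```

After the call only the host-a and host-c lines remain, which mirrors removing one stale entry from known_hosts by the line number that ssh reports.
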
Categories: Linux, Sysadmin Tags:

Reinstall CentOS using grub

December 31st, 2009 thughes No comments

This post is here mainly because I always forget how to do it. This is one of the simplest ways to reinstall a CentOS system (it will probably work for RHEL and maybe even Fedora) without needing PXE or physical access to the machine. Make sure that you have tested your kickstart before you use it, and don’t blame me if anything goes wrong.

Save the following script, make it executable, then run it. It will ask some questions about networking and hostname and then write a new grub stanza to your grub.conf. It will also download the kernel and initrd, using the URL found in your kickstart, and put them where grub can find them when it boots.

When you reboot you should be able to select Kickstart Centos, and it will boot the new kernel, pull down the kickstart and reinstall.

#!/bin/bash -x

echo -n "Enter kickstart url: "
read -e ksurl

echo -n "Enter Hostname: "
read -e hostname
echo -n "Enter IP Address: "
read -e ipaddr
echo -n "Enter Gateway: "
read -e gateway
echo -n "Enter Netmask: "
read -e netmask
echo -n "Enter Nameservers: "
read -e nameservers

# Pull the install tree URL out of the kickstart (the line containing "http")
repourl=$(curl "$ksurl" 2>/dev/null | sed -n 's/.*\(http\)/\1/p')
#echo $repourl
vmlinuz_url="${repourl}/isolinux/vmlinuz"
initrd_url="${repourl}/isolinux/initrd.img"

date_now=$(date +%Y%m%d%H%M%S)

grub_stanza="
title Kickstart Centos 5 ${date_now}
        root (hd0,0)
        kernel /reinstall/vmlinuz ksdevice=eth0 load_ramdisk=1 prompt_ramdisk=0 ramdisk_size=16384 serial hostname=${hostname} ip=${ipaddr} gateway=${gateway} netmask=${netmask} dns=${nameservers} noipv6 ks=${ksurl}
        initrd /reinstall/initrd.img
"

echo "$grub_stanza"

echo -n "Please check the grub stanza above and enter 'y' if it is correct: "
read -e confirmed

if [ "$confirmed" == 'y' ]; then
        echo "Downloading kernel and initrd..."
        mkdir -p /boot/reinstall
        (cd /boot/reinstall/ && /usr/bin/urlgrabber "$vmlinuz_url")
        (cd /boot/reinstall/ && /usr/bin/urlgrabber "$initrd_url")
        # Back up grub.conf, then append the new stanza to the same file
        cp /boot/grub/grub.conf /boot/grub/grub.conf.bak_$(date +%Y%m%d%H%M%S)
        echo "$grub_stanza" >> /boot/grub/grub.conf
fi
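
To see how the script derives the kernel and initrd locations, the sed expression can be tried on a sample kickstart without touching the network. This is a sketch only: the kickstart text and the mirror.example.com URL are invented for illustration, and it assumes the kickstart contains a single line with an http URL.

```shell
#!/bin/bash
# Hypothetical kickstart contents; the real script fetches them with curl.
sample_ks='install
url --url http://mirror.example.com/centos/5/os/x86_64
lang en_US.UTF-8'

# Keep only lines containing "http", stripping everything before the URL.
repourl=$(printf '%s\n' "$sample_ks" | sed -n 's/.*\(http\)/\1/p')

# Build the download locations the same way the script does.
vmlinuz_url="${repourl}/isolinux/vmlinuz"
initrd_url="${repourl}/isolinux/initrd.img"
echo "$vmlinuz_url"
echo "$initrd_url"
```

If the kickstart mentions more than one http URL (extra repo lines, for example), repourl will contain several lines and the derived paths will be wrong, so it is worth echoing the value before trusting it.
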
Categories: Centos, Linux, Sysadmin Tags:

Python HTTPConnection bound to network interface

May 27th, 2009 thughes 1 comment

The web servers I use at work are multi-homed, with the default route being the internal management network. We came across an issue where we wanted to make an XMLHttpRequest for a data feed from another company into our web app. Browsers block cross-domain requests like this to guard against cross-site scripting attacks, so we had to write a little proxy script to pull the data and serve it from our own site. The standard python httplib doesn’t have the ability to bind to a specific interface, so I have done a little sub-classing and now have an HTTPConnection which allows me to bind to a specific interface. Hope this helps someone, as from my searching it seems to be a common request. You will need to change the IP address to match your setup :-)

import httplib
import socket
 
 
 
class HTTPConnectionInterfaceBound(httplib.HTTPConnection):
    """This class allows communication via a bound interface for 
       multi network interface machines."""
 
    def __init__(self, host, port=None, strict=None, bindip=None):
        httplib.HTTPConnection.__init__(self, host, port, strict)
        self.bindip = bindip
 
 
    def connect(self):
        """Connect to the host and port specified in __init__."""
        msg = "getaddrinfo returns an empty list"
        for res in socket.getaddrinfo(self.host, self.port, 0,
                                      socket.SOCK_STREAM):
            af, socktype, proto, canonname, sa = res
            try:
                self.sock = socket.socket(af, socktype, proto)
                if self.debuglevel > 0:
                    print "connect: (%s, %s)" % (self.host, self.port)
                if self.bindip is not None:
                    # Bind to the chosen local address; port 0 lets the OS pick one
                    self.sock.bind((self.bindip, 0))
                self.sock.connect(sa)
            except socket.error, msg:
                if self.debuglevel > 0:
                    print 'connect fail:', (self.host, self.port)
                if self.sock:
                    self.sock.close()
                self.sock = None
                continue
            break
        if not self.sock:
            raise socket.error, msg
 
 
 
conn = HTTPConnectionInterfaceBound('www.thegoldfish.org', 80, bindip='192.168.56.83')
conn.request("GET", "/")
r1 = conn.getresponse()
print r1.status, r1.reason
print r1.read()
Categories: Python Tags:

Using external scripts with django models

May 5th, 2009 thughes No comments

I have used a few web frameworks over the years, but I think I have finally found the one that suits my particular needs. I have played with RoR, TurboGears, Catalyst and a couple of others, but none of them actually made me want to write code; I was just hoping they would let me write less of it. That was until I discovered Django. A friend of mine had said he was using it for his website, but for some reason I managed to get it stuck in my head that he was using Mambo CMS, so I never really paid it much attention.

Something that I usually need in a web application is some way to interact with it from the command line. I am a systems administrator; CLI is what I do. In the past I have just connected directly to the database from my scripts, but it has always bothered me that I have already done all this business logic in my models and I end up having to repeat it in a limited fashion. I have had a brief look around a couple of times but never found anything that worked to my satisfaction until I found the following code by James Bennett at http://www.b-list.org/weblog/2007/sep/22/standalone-django-scripts/. James has several methods on that page, but I think that this one will be best if you plan on distributing your application.

import logging
import os
from optparse import OptionParser

usage = "usage: %prog -s SETTINGS | --settings=SETTINGS"
parser = OptionParser(usage)
parser.add_option('-s', '--settings', dest='settings', metavar='SETTINGS',
                          help="The Django settings module to use")
(options, args) = parser.parse_args()
if not options.settings:
    parser.error("You must specify a settings module")

# parser.error() exits, so this only runs when a settings module was given
os.environ['DJANGO_SETTINGS_MODULE'] = options.settings

LOG_FILENAME = 'logging.log'
logging.basicConfig(filename=LOG_FILENAME, level=logging.INFO)

The issue with this, as I realised after a while, was that it didn’t work unless you were in the base directory of your application. This may be because I was doing something wrong; I am relatively new to python. Then I thought, how does Django do it? The answer lives in the setup_environ() function in django.core.management. Anyway, I ended up using a mashup of the two concepts to arrive at the following code. It is a bit long-winded but it works from any directory, so the script can be anywhere. It sets the environment up based on the settings file being in the normal place, in the base directory of the application.

import os
import sys
from optparse import OptionParser
 
# Parse the command line options
usage = "usage: %prog -s SETTINGS | --settings=SETTINGS"
parser = OptionParser(usage)
parser.add_option('-s', '--settings', dest='settings', metavar='SETTINGS',
                          help="The Django settings module to use")
(options, args) = parser.parse_args()
if not options.settings:
    parser.error("You must specify a settings module")
 
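# Work out the project directory and name from the settings file path,
# then import the project package so Django can find the settings module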
project_directory, settings_filename = os.path.split(options.settings)
if project_directory == os.curdir or not project_directory:
    project_directory = os.getcwd()
project_name = os.path.basename(project_directory)
settings_name = os.path.splitext(settings_filename)[0]
sys.path.append(os.path.join(project_directory, os.pardir))
project_module = __import__(project_name, {}, {}, [''])
sys.path.pop()
 
os.environ['DJANGO_SETTINGS_MODULE'] = '%s.%s' % (project_name, settings_name)

With that added at the top of my scripts I can keep my maintenance scripts in a more logical place like /usr/local/bin instead of the project root and run them like this:

/usr/local/bin/myscript.py --settings=/path/to/project/settings.py
Categories: Django, Python Tags:

libvirt-qpid and python

May 2nd, 2009 thughes No comments

Libvirt is fast becoming the standard tool for managing virtual machines on Linux, and Qpid is the Apache Software Foundation’s new implementation of AMQP, which is the first open standard for enterprise messaging. These two technologies have the potential to work well together for large virtualization installations, and luckily for us the good guys in the libvirt team have combined them in libvirt-qpid (http://libvirt.org/qpid/), but there are currently very few examples of how to use it. I am putting this brief tutorial in their wiki as a starting point for others, but will continue to publish my experiences here.

Installation

libvirt-qpid is currently available in Fedora 10 repositories so you can install it using yum

yum -y install libvirt-qpid qpidd python-qpid
chkconfig libvirt-qpid on
chkconfig qpidd on
service libvirt-qpid start
service qpidd start

Testing that it is running

We can check that it is running using qpid-tool and the list command

# qpid-tool
Management Tool for QPID
qpid: list
Management Object Types:
ObjectType                 Active  Deleted
============================================
com.redhat.libvirt:domain  6       0
com.redhat.libvirt:node    1       0
com.redhat.libvirt:pool    1       0

Simple client in python

Now that we have it running, let’s make a simple client to get information from it. To do this I use python. The following is a simple script that does some of the basics.

#!/usr/bin/env python
 
from qmf.console import Session
from yaml import dump
 
sess = Session() # defaults to synchronous-only operation. It also defaults to user-management of connections.
 
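# attempt to connect to a broker; stop here if that fails
try:
    broker = sess.addBroker('amqp://localhost:5672')
    print "Connection Success"
except Exception:
    print "Connection Failed"
    raise SystemExit(1)

# qpid-tool lists the objects as package:class, i.e. com.redhat.libvirt:domain
domains = sess.getObjects(_class='domain', _package='com.redhat.libvirt')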
 
# Print a list of the domains
for d in domains:
    print d
 
# Select the first domain
domain = domains[0]
 
# Print a list of the properties of the domain
print 'Properties:'
props = domain.getProperties()
for prop in props:
    print "\t",prop
 
# Access a value of a property and print it
print domain.name
 
# Print a list of the methods of the domain
print 'Methods:'
meths = domain.getMethods()
for meth in meths:
    print "\t",meth
 
# Call a method of the domain and print the result
xmldesc = domain.getXMLDesc()
print xmldesc
 
# Call another method of the domain and print the result
if domain.state == 'running':
    result = domain.shutdown()
    print result
else:
    result = domain.create()
    print result
 
# Disconnect from the broker (otherwise we hang the terminal)
sess.delBroker(broker)

Links

http://qpid.apache.org/qmf-python-console-tutorial.html

Categories: AMQP, Libvirt, Python Tags: