Friday, June 24, 2005

SCP without passwords

http://www.justenoughlinux.com/2004/04/14/scp_without_passwords.html
http://www.linuxgazette.com/node/193


In this article I'll show you how to use scp without passwords. Then I'll show you how to use this in two cool scripts. One script lets you copy a file to multiple linux boxes on your network and the other allows you to easily back up all your linux boxes.

If you're a linux sysadmin, you frequently need to copy files from one linux box to another, or distribute a file to multiple boxes. You could use ftp, but there are many advantages to using scp instead. Scp is much more secure than ftp: scp traffic travels across the LAN/WAN encrypted, while ftp uses clear text (even for passwords).

But what I like best about scp is that it's easily scriptable. Suppose you have a file that you need to distribute to 100 linux boxes. I'd rather write a script to do it than type 100 sets of copy commands. If you use ftp in your script it can get pretty messy, because each linux box you log into is going to ask for a password. But if you use scp in your script, you can set things up so the remote linux boxes don't ask for a password. Believe it or not, this is actually much more secure than using ftp!

Here's an example demonstrating the most basic syntax for scp. To copy a file named 'abc.tgz' from your local pc, to the /tmp dir of a remote pc called 'bozo' use:

scp abc.tgz root@bozo:/tmp

You will now be asked for bozo's root password, so we're not quite there yet: it's still asking for a password, which means it's not easily scriptable. To fix that, follow this one-time procedure (then you can do endless "passwordless" scp copies):

1. Decide which user on the local machine will be using scp later on. Of course root gives you the most power, and that's how I personally have done it. I won't give you a lecture on the dangers of root here, so if you don't understand them, use a different user. Whatever you choose, log in as that user now for the rest of the procedure, and log in as that user when you use scp later on.

2. Generate a public / private key pair on the local machine. Say What? If you're not familiar with Public Key Cryptography, here's the 15 second explanation. In Public Key Cryptography, you generate a pair of mathematically related keys, one public and one private. Then you give your public key to anyone and everyone in the world, but you never ever give out your private key. The magic is in the mathematical makeup of the keys - anyone with your public key can encrypt a message with it, but only you can decrypt it with your private key. Anyway, the syntax to create the key pair is:

ssh-keygen -t rsa

3. In response you'll see:
"Generating public/private rsa key pair"
"Enter file in which to save the key ... "
Just hit enter to accept this.

4. In response you'll see:
"Enter passphrase (empty for no passphrase):"
You don't need a passphrase, so just hit enter twice.

5. In response you'll see:
"Your identification has been saved in ... "
"Your public key has been saved in ... "
Note the name and location of the public key just generated (it will always end in .pub).

6. Copy the public key just generated to all your remote linux boxes. You can use scp or ftp or whatever to do the copy. Assuming you're using root (again, see my warning in step 1 above), the key must end up in the file /root/.ssh/authorized_keys (watch the spelling!). If you are logging in as a regular user, e.g. clyde, it would be /home/clyde/.ssh/authorized_keys. Note that the authorized_keys file can contain keys from other PC's, so if the file already exists and has text in it, you need to append the contents of your public key file to it.
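For example, assuming the key pair was created with the default file name (id_rsa.pub) and the remote box is called bozo, one way to append the key in a single step looks roughly like this (the host name and paths are just placeholders):

cat ~/.ssh/id_rsa.pub | ssh root@bozo "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"

Many distros also ship an ssh-copy-id helper that does essentially the same thing.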

That's it. Now, with a little luck, you should be able to scp a file to the remote box without using a password. So let's test it by trying our first example again. Copy a file named 'xyz.tgz' from your local pc to the /tmp dir of a remote pc called 'bozo':

scp xyz.tgz root@bozo:/tmp

Wow !!! It copied with no password!!

A word about security before we go on. This local PC just became pretty powerful, since it now has access to all the remote PC's with only the one local password. So that one password better be very strong and well guarded.

Now for the fun part. Let's write a short script to copy a file called 'houdini' from the local PC to the /tmp dir of ten remote PC's, in ten different cities (with only 5 minutes' work). Of course it would work just the same with 100 or 1000 PC's. Suppose the 10 PC's are called: brooklyn, oshkosh, paris, bejing, winslow, rio, gnome, miami, minsk and tokyo. Here's the script:

#!/bin/sh
for CITY in brooklyn oshkosh paris bejing winslow rio gnome miami minsk tokyo
do
scp houdini root@$CITY:/tmp
echo $CITY " is copied"
done
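If you're new to running scripts: save those lines to a file (the name below is just an example), make it executable, and run it:

chmod +x copyfile.sh
./copyfile.sh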

Works like magic. With the echo line in the script you should be able to watch as each city is completed, one after the next.

By the way, if you're new to shell scripting, here's a pretty good tutorial:
http://www.freeos.com/guides/lsst/.

As you may know, scp is just one part of the much broader ssh. Here's the cool part. When you followed my six-step procedure above, you also gained the ability to sit at your local PC and execute any command you like on any of the remote PC's (without a password, of course!). Here's a simple example, to view the date & time on the remote PC brooklyn:

ssh brooklyn "date"

Now let's put these 2 concepts together for one final and seriously cool script. It's a down and dirty way to back up all your remote linux boxes. The example backs up the /home dir on each box. It's primitive compared to the abilities of commercial backup software, but you can't beat the price. Consider the fact that most commercial backup software charges license fees for each machine you back up. If you use such a package, instead of paying license fees to back up 100 remote PC's, you could use the script to back up the 100 PC's to one local PC, then back up that local PC with your commercial package and save the license fees for 99 PC's! Anyway, the script demonstrates the concepts so you can write your own to suit your situation. Just put this script in a cron job on your local PC (no script is required on the remote PC's). Please read the comments carefully, as they explain everything you need to know:

#!/bin/sh

# Variables are upper case for clarity

# before using the script you need to create a dir called '/tmp/backups' on each
# remote box & a dir called '/usr/backups' on the local box

# on this local PC
# Set the variable "DATE" & format the date cmd output to look pretty
#
DATE=$(date +%b%d)

# this 'for loop' has 3 separate functions

for CITY in brooklyn oshkosh paris bejing winslow rio gnome miami minsk tokyo
do

# remove the tarball on the remote box from the previous time the script ran,
# to avoid filling up your HD
# then echo it for troubleshooting
#
ssh $CITY "rm -f /tmp/backups/*.tgz"
echo $CITY " old tarball removed"

# create a tarball of the /home dir on each remote box & put it in /tmp/backups
# name the tarball uniquely with the date & city name
#
ssh $CITY "tar -zcvpf /tmp/backups/$CITY.$DATE.tgz /home/"
echo $CITY " is tarred"

# copy the tarball just created from the remote box to the /usr/backups dir on
# the local box
#
scp root@$CITY:/tmp/backups/$CITY.$DATE.tgz /usr/backups
echo $CITY " is copied"

done

# the rest of the script is for error checking only, so it's optional:

# on this local PC
# create an error file with today's date.
# If any box doesn't get backed up, it gets written to this file
#
touch /usr/backups/scp_error_$DATE

for CITY in brooklyn oshkosh paris bejing winslow rio gnome miami minsk tokyo

do

# Check if the tarball was copied to the local box. If not, write to the error file
# note the use of '||', which runs the command after it only if the command before
# it fails
#
ls /usr/backups/$CITY.$DATE.tgz > /dev/null || echo "$CITY did not copy" >> /usr/backups/scp_error_$DATE

# Check if the tarball can be opened w/o errors. If there are errors, write to the error file.
tar ztvf /usr/backups/$CITY.$DATE.tgz > /dev/null || echo "tarball of $CITY is No Good" >> /usr/backups/scp_error_$DATE

done
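Since the whole point is to run this from cron on the local PC, here's what a crontab entry might look like (the script path and the 2:30 AM schedule are only examples):

# run the remote backup script every night at 2:30 AM (path is hypothetical)
30 2 * * * /root/scripts/remote_backup.sh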

That's about it. In this article I've tried to give examples that demonstrate the concepts, not necessarily to be used "as is". Some of the syntax may not work in all distros, but in the interest of brevity I could not include all the possibilities. For example, if you are using Red Hat 6.2 or earlier, the syntax will require some changes (I'd be happy to give them to you if you email me). So be creative, and hopefully you can use some of this in your own environment.

Optimizing High Traffic Servers

Here is some very valuable information that I'm archiving.

Original URL: http://www.crucialparadigm.com/resources/tutorials/server-administration/optimize-tweak-high-traffic-servers-apache-load.php

Focus: Linux, Apache 1.3+, [PHP], [MySQL]
Notes: Use at your own risk. If this has any errors, please let me know and I will correct them.

Summary
If you are reaching the limits of your server running Apache serving a lot of dynamic content, you can either spend thousands on new equipment or reduce bloat to increase your server capacity by anywhere from 2 to 10 times. This article concentrates on important and poorly-documented ways of increasing capacity without additional hardware.

Problems
There are a few common things that can cause server load problems, and a thousand uncommon. Let's focus on the common:
Drive Swapping - too many processes (or runaway processes) using too much RAM
CPU - poorly optimized DB queries, poorly optimized code, runaway processes
Network - hardware limits, moron attacks

Solutions: The Obvious
Briefly, and for completeness, here are the most obvious solutions:

Use "TOP" and "PS axu" to check for processes that are using too much CPU or RAM.
Use "netstat -anp sort -u" to check for network problems.

Solutions: Apache's RAM Usage
First and most obvious, Apache processes use a ton of RAM. This minor issue becomes a major issue when you realize that after each process has done its job, the bloated process sits around and spoon-feeds data to the client instead of moving on to bigger and better things. This is further compounded by a bit of essential info that should really be more common knowledge:

If you serve 100% static files with Apache, each httpd process will use around 2-3 megs of RAM.
If you serve 99% static files & 1% dynamic files with Apache, each httpd process will use from 3-20 megs of RAM (depending on your MOST complex dynamic page).

This occurs because a process grows to accommodate whatever it is serving, and NEVER decreases again unless that process happens to die. Quickly, unless you have very few dynamic pages and major traffic fluctuation, most of your httpd processes will take up an amount of RAM equal to the largest dynamic script on your system. A smart web server would deal with this automatically. As it is, you have a few options to manually improve RAM usage.

Reduce wasted processes by tweaking KeepAlive
This is a tradeoff. KeepAliveTimeout is the amount of time a process sits around doing nothing but taking up space. Those seconds add up in a HUGE way. But using KeepAlive can increase speed for both you and the client - disable KeepAlive and the serving of static files like images can be a lot slower. I think it's best to have KeepAlive on, and KeepAliveTimeout very low (like 1-2 seconds).
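For reference, that suggestion translates into something like these httpd.conf lines (the 2-second timeout is just the example value mentioned above):

KeepAlive On
KeepAliveTimeout 2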

Limit total processes with MaxClients
If you use Apache to serve dynamic content, your simultaneous connections are severely limited. Exceed a certain number, and your system begins cannibalistic swapping, getting slower and slower until it dies. IMHO, a web server should automatically take steps to prevent this, but instead they seem to assume you have unlimited resources. Use trial & error to figure out how many Apache processes your server can handle, and set this value in MaxClients. Note: the Apache docs on this are misleading - if this limit is reached, clients are not "locked out", they are simply queued, and their access slows. Based on the value of MaxClients, you can estimate the values you need for StartServers, MinSpareServers, & MaxSpareServers.
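As a rough sketch, if trial & error showed your box could comfortably hold about 150 Apache processes, the relevant httpd.conf directives might look like this (all numbers are illustrative, not recommendations):

MaxClients 150
StartServers 10
MinSpareServers 10
MaxSpareServers 25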

Force processes to reset with MaxRequestsPerChild
Forcing your processes to die after a while makes them start over with low RAM usage, and this can reduce total memory usage in many situations. The less dynamic content you have, the more useful this will be. This is a game of catch-up, with your dynamic files constantly increasing total RAM usage and restarting processes constantly reducing it. Experiment with MaxRequestsPerChild - even values as low as 20 may work well. But don't set it too low, because creating new processes does have overhead. You can figure out the best settings under load by examining "ps axu --sort=rss". A word of warning: using this is a bit like using heroin. The results can be impressive, but are NOT consistent - if the only way you can keep your server running is by tweaking this, you will eventually run into trouble. That being said, by tweaking MaxRequestsPerChild you may be able to increase MaxClients as much as 50%.
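For example, a cautious starting point in httpd.conf might be (again, the number is only illustrative - experiment under real load):

MaxRequestsPerChild 500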

Apache Further Tweaking
For mixed-purpose sites (say image galleries, download sites, etc.), you can often improve performance by running two different apache daemons on the same server. For example, we recently compiled a separate apache to just serve up images (gifs, jpegs, pngs, etc.) for a site that has thousands of stock photos. We put both the main apache and the image apache on the same server and noticed a drop in load and RAM usage. Considering a page had about 20-50 image calls, these were all off-loaded to the stripped-down apache, which could run 3x more servers with the same RAM usage as the regular apache on the server.
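A bare-bones sketch of the idea: build (or copy) a second, stripped-down httpd with its own config and start it alongside the main one - the paths and config names below are hypothetical:

# main apache keeps serving dynamic content as usual
apachectl start
# stripped-down image apache runs from its own config (e.g. bound to another port or IP)
/usr/local/apache-images/bin/httpd -f /usr/local/apache-images/conf/httpd.conf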


Finally, think outside the box: replace or supplement Apache

Use a 2nd server
You can use a tiny, lightning-fast server to handle static documents & images, and pass any more complicated requests on to Apache on the same machine. This way Apache won't tie up its multi-megabyte processes serving simple streams of bytes. You can have Apache only get used when, for example, a php script needs to be executed. Good options for this are:

TUX / "Red Hat Content Accelerator" - http://www.redhat.com/docs/manuals/tux/
kHTTPd - http://www.fenrus.demon.nl/
thttpd - http://www.acme.com/software/thttpd/
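As an illustration of how lightweight these can be, thttpd typically needs little more than a port and a document directory to start serving; something along these lines (check the man page on your system for the exact flags):

# serve static files from an example docroot on an alternate port
thttpd -p 8080 -d /var/www/static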

Try lingerd
Lingerd takes over the job of feeding bytes to the client after Apache has fetched the document, but requires kernel modification. Sounds pretty good, haven't tried it. lingerd - http://www.iagora.com/about/software/lingerd/

Use a proxy cache
A proxy cache can keep a duplicate copy of everything it gets from Apache, and serve the copy instead of bothering Apache with it. This has the benefit of also being able to cache dynamically generated pages, but it does add a bit of bloat.
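For instance, a reverse-proxy ("accelerator") setup with Squid 2.x in front of Apache looked roughly like the snippet below; the directive names are from memory of that era's squid.conf, so verify them against your Squid version's documentation:

# Squid answers on port 80 and forwards cache misses to Apache on 8080 (ports are examples)
http_port 80
httpd_accel_host 127.0.0.1
httpd_accel_port 8080
httpd_accel_with_proxy off
httpd_accel_uses_host_header on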

Replace Apache completely
If you don't need all the features of Apache, simply replace it with something more scalable. Currently, the best options appear to be servers that use a non-blocking I/O technology and connect to all clients with the same process. That's right - only ONE process. The best include:

thttpd - http://www.acme.com/software/thttpd/
Caudium - http://caudium.net/index.html
Roxen - http://www.roxen.com/products/webserver/
Zeus ($$) - http://www.zeus.co.uk

Solutions: PHP's CPU & RAM Usage
Compiling PHP scripts is usually more expensive than running them. So why not use a simple tool that keeps them precompiled? I highly recommend Turck MMCache. Alternatives include PHP Accelerator, APC, & Zend Accelerator. You will see a speed increase of 2x-10x, simple as that. I have no stats on the RAM improvement at this time.
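For Turck MMCache specifically, installation boils down to building the extension and adding a few lines to php.ini; a sketch along these lines (the extension path varies by system, and the directive names are as I recall them from the MMCache README - verify against the docs of whichever accelerator you choose):

; enable the opcode cache (path and values are examples)
zend_extension="/usr/lib/php4/mmcache.so"
mmcache.enable="1"
mmcache.shm_size="16"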

Solutions: Optimize Database Queries
This is covered in detail everywhere, so just keep in mind a few important notes: one bad query statement running often can bring your site to its knees, and two or three bad query statements don't perform much differently than one. In other words, if you optimize one query you may not see any server-wide speed improvement, but if you find & optimize ALL your bad queries you may suddenly see a 5x server speed improvement. The log-slow-queries feature of MySQL can be very helpful.

How to log slow queries:

# vi /etc/rc.d/init.d/mysqld

Find this line:
SAFE_MYSQLD_OPTIONS="--defaults-file=/etc/my.cnf"

change it to:
SAFE_MYSQLD_OPTIONS="--defaults-file=/etc/my.cnf --log-slow-queries=/var/log/slow-queries.log"

As you can see, we added the option of logging all slow queries to /var/log/slow-queries.log
Save and close the file (in vi: Shift + Z + Z).

touch /var/log/slow-queries.log
chmod 644 /var/log/slow-queries.log

Restart mysql:
service mysqld restart
mysqld will now log all slow queries to this file.
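Alternatively, the same option can live in /etc/my.cnf itself (which won't get clobbered if the init script is ever replaced); a minimal sketch, with an example threshold:

[mysqld]
log-slow-queries = /var/log/slow-queries.log
# queries taking longer than this many seconds get logged (example value)
long_query_time = 2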

References
These sites contain additional, better-known methods for optimization.

Tuning Apache and PHP for Speed on Unix - http://php.weblogs.com/tuning_apache_unix
Getting maximum performance from MySQL - http://www.f3n.de/doku/mysql/manual_10.html
System Tuning Info for Linux Servers - http://people.redhat.com/alikins/system_tuning.html
mod_perl Performance Tuning (applies outside perl) - http://perl.apache.org/docs/1.0/guide/performance.html

Once again, if this has any errors or important omissions, please let me know and I will correct them.
If you experience a capacity increase on your server after trying the optimizations, let me know!

Article written by spagmoid, with additions by albo and huck on the ev1 forums.