
Answers from Paco

Kevin J. Prey writes:
I was wondering if there was a way to require passwords to access html pages that are on the department server (other than starting my own web server).

Take a look at /home/paco/public_html/secure. It's a demo page for password-based access on our servers. The important files are:
.htaccess   Determines what kind of restrictions are imposed
.htpasswd   Contains userids and encrypted passwords
.htgroup    Contains groups and membership info

The htpasswd command will help you make .htpasswd files.
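For example, something along these lines (the paths here are just illustrative; point it at your own .htpasswd file, and use -c only the first time, since -c creates the file from scratch):

htpasswd -c /home/yourid/public_html/secure/.htpasswd guest
htpasswd /home/yourid/public_html/secure/.htpasswd someoneelse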

If you go to http://www.apache.org and look in their module documentation, there's a "mod_auth" module which controls access. That page tells you how to control access.

Here's my bookmark to it:
http://www.rge.com/pub/infosystems/apache/docs/mod/mod_auth.html

If you look around, you'll find lots and lots of other cool things that .htaccess files can do. (do "find /home/paco/public_html -name .htaccess -print". That will find all my .htaccess files. Look at what they do to their respective pages.)

This should be an FAQ that is available from our main web pages. I've CC'ed the web team on this so that one of them can do a little research and perhaps produce a step-by-step guide.

If my examples aren't enough, let me know. I can explain it in more detail.



I want to go through all of the files under /helpnet and do a replace of text. How can I do this best?
My favorite is a for loop and sed. Here's a quick example:
  for i in `find . -name '*.raw'`; do
    cp $i $i.temp
    sed -e 's/old text/new text/g' < $i.temp > $i
    rm $i.temp
    echo $i updated.
  done

Line by line, that is pretty much:

Loop through all files in the current directory and below matching *.raw
Make a copy of blah.raw as blah.raw.temp
Run the sed search-and-replace on blah.raw.temp, writing the result back into blah.raw
Delete blah.raw.temp
Echo a status message

Sed is worth a couple of extra hints... Alphanumeric characters and spaces are okay. If you start throwing in []./\!$^ characters, expect trouble. I'll be happy to explain them if you want; their behavior is a lot like in Perl and grep, if you understand how regexes are used there.

It might not be a bad idea to omit the 'rm $i.temp' line so that you have backup copies; just delete them when you're pretty sure the operation worked. It's a little too easy to clobber an entire directory this way. :)


Thanks. Okay, I want to replace a string with a URL (with /'s) in it with another string. How so?
Escape them. \ is the escape character. / is a string delimiter, as in s/first/second/g. Dot is a wildcard and matches any one character, so that has to be escaped, too. So, for example, to do this substitution
http://www.vt.edu/ -> http://www.virginia.edu/
your sed string would be
sed -e 's/http:\/\/www\.vt\.edu\//http:\/\/www\.virginia\.edu\//g'
Oh, yeah, two other important characters to use with caution: * and +. * means '0 or more of the preceding character.' So 'Q*' matches '' and 'QQQQQ', etc. + means '1 or more of the preceding character.' --Patrick



Do you know of any methods that I could use to determine who owns a specific domain name and to find out information based on the domain name and the IP address?
The only thing I know how to do is:

whois -h whois.internic.net coolsavings.com

Just to add to this:

A lot of new domain and IP registrations are occurring on whois.arin.net, so sometimes you may have to use:
whois -h whois.arin.net coolsavings.com

Some are also occurring on whois.ripe.net (Europe), whois.apnic.net (Asia), and whois.nic.mil (military).

You can also find out who owns an IP address in a couple different ways.
The easiest is 'nslookup 135.24.35.46', but that doesn't work if the person doesn't have a reverse-lookup entry on their nameserver. You can use 'whois' here, too:
whois -h whois.internic.net 135.24.0.0

The trick is knowing how many zeros to put. If the first digit is 1-127, use three zeros. If it's 128-191, use two. If it's 192-255, use one. All the IP blocks I checked are registered with arin.net (not internic.net) now. Examples:

3.16.16.14 -> 3.0.0.0 (arin.net says this is GE)
128.143.150.79 -> 128.143.0.0 (arin.net says this is UVA)
206.205.42.253 -> 206.205.42.0 (arin.net says this is Cornerstone)
--Patrick



/usr/cs/contrib/bin/giftool -h will tell you the following:

Usage: giftool [options] [file]
       giftool (-p|-c|-B) [options] [files...]
        -B      Batch Mode, read and write the same filename
        -i      Set GIF Interlace mode ON
        +i      Set GIF Interlace mode OFF
        -p      Print information about file(s)
        -c      Print comment information
        +c      Add comments to file(s)
        -C      Strip comment from file(s)
        -o file Send output to 'file'
        -rgb name       Use 'name' as the transparent pixel
        -rgb ##,##,##   Use rgb-value as the transparent pixel
        -###    Use pixel index as transparent (1 == first colormap entry)

For instance you could say 'giftool -B -i *.gif' to convert all your images to interlaced GIF files in one easy step.

So, if you want white to be transparent, use:
giftool -B -rgb ff,ff,ff *.gif




Netscape here seems to display chronic problems of either running wild or going to sleep, never to wake up.
When it runs wild, it tends to use up a whole CPU doing something useless, and often the user is not aware that anything has happened. This is disruptive to other users because the machine becomes bogged down with the rogue Netscape.
When it goes to sleep, it can be unkillable. This seems unlikely, but it is quite possible, and Netscape seems to have a flair for it.
Possible Causes and Potential Solutions:

As far as I can tell, most of this behavior is related to the cache that Netscape keeps.

I, personally, don't experience these runaways or sleepers and I do the following things for normal use:
- disable JavaScript
- disable Java
- set my netscape cache to be /tmp/bah6f

If you're consistently plagued by runaways, try some of these steps.

All of these options are controlled in the Options menu item. Do NOT, however, simply set your cache to be /tmp. It MUST be something like /tmp/userid.

Putting your cache somewhere in /tmp keeps the cache on local disk (which saves fileserver space and gives you some minor performance increase), and using a per-userid subdirectory means you don't have multiple writers to the same cache.

You may also benefit from cleaning your cache directory periodically: rm -rf ~/.netscape/cache

Some Workarounds:
- If a Netscape quits responding, you can use the command 'xkill' to kill it. Don't just iconify it and hope it goes away. To use xkill, simply type 'xkill' at a command prompt, then click in the window to kill. Be careful to click in the right window, because this program really does kill whatever you click on, no questions asked.

- If you find that you have some Netscape processes running, and you want to kill them, use 'top -U userid' (where "userid" is your userid) to find the PIDs (process IDs). Then do 'kill -9 PID' where PID is the process ID.

   e.g. here's some of the output of "top -Ubah6f":
PID USERNAME PRI NICE  SIZE   RES STATE   TIME  WCPU  CPU COMMAND
372 bah6f     34    0 6272K 5516K sleep  11:10 0.83% 0.83% emacs19
353 bah6f     34    0 2464K 1764K sleep   0:11 0.09% 0.09% fvwm
428 bah6f     34    0 2800K 2136K sleep   0:02 0.08% 0.08% xterm
429 bah6f     33    0 1200K 1028K sleep   0:01 0.02% 0.02% bash
480 bah6f     34    0   11M 7916K sleep   0:56 0.00% 0.00% netscape3.0G

If I want to kill my Netscape, I do 'kill -9 480'

- If you can't kill a Netscape, (i.e. you've tried 'kill -9' and it won't go away), there is nothing that root can do about it. Really. If you can't kill it, neither can we.

The good thing is that they use very few resources if they go to sleep forever. They get swapped out to disk and use no CPU as far as I can tell. Unless you're creating dozens of them, they are nothing to worry about. They'll go away at the next reboot.

Any other questions or comments can be directed to root.



We have the following libraries/packages installed:

CGI File Font HTML HTTP LWP MD5 MIME Net Socket URI WWW



D. Mansouri writes:
Yes that's correct. I have updated the macro. Now we have to go back in each .raw file and change the calls to the macro_letter to indicate the extra argument.
If you want to be a perl guru, here's a hint. I don't know the whole syntax offhand, but this will get you started in the right direction. The following syntax will search and replace in every file listed on the command line:

perl -pi.bak -e 's/oldstring/newstring/g' file1 file2 file3 file4 ...

Before it does the substitution, it makes a copy of the original and names it file1.bak, file2.bak, etc. What you're going to want to do, though, is preserve part of the "oldstring," and this is the part I know perl can do, but I don't know how, offhand. You want to turn macro_letter(XXX) into macro_letter(XXX, BLUE). Perl can match the XXX with its $1 operator. Something like:

perl -pi.bak -e 's/macro_letter(\([^)]\))/macro_letter($1, BLUE)/g'

If you want a quick explanation of that syntax, ask. Some of it is right, but not all of it. You'll need a Perl book to get it 100% right. I recommend copying a small piece of the webman hierarchy to /tmp on some machine and playing with these commands there for a minute. When you finally figure out the syntax, be sure to save the command you ran into a file. You'll never remember otherwise. Look at /home/bah6f/src/CoolCommands for an example of what I mean.
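As a starting point for that experimenting, a variant along these lines should be close (it assumes each existing call has a single argument with no parentheses in it, and that no call already has the extra argument); try it in your /tmp sandbox first:

perl -pi.bak -e 's/macro_letter\(([^)]*)\)/macro_letter($1, BLUE)/g' file1 file2 ...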

The last thing I would suggest is how you get the list of file names on the command line. You want everything .raw, right? Here's a neat command:

find . -name \*.raw -print | xargs perl -pi.bak ...

The find command will find everything ending in .raw and print it out on stdout. The 'xargs' command takes everything from stdin (i.e. the list of files that the 'find' command just found) and puts them on the command line of the command that you give it.

A dumb but effective illustration of this is using 'find' and 'xargs' to do the effect of an 'ls -l' command:

find . -name \*.txt -print | xargs ls -l

This is a lot to digest. There's more where it came from.



philo juang writes:
Does anyone know how to uncompress a .gz file on the cs machines? gzip -d works just fine on watt.seas... Also, before i run into more problems, does tar -xvf work on cs machines as well?
gzip -d works on CS machines too. Did you try before you asked? If you don't have gzip in your path by default, then you've probably messed up your .profile or other shell config files.

Also, you'll want to add /usr/cs/contrib/bin to your path to get some more cool tools. For instance /usr/cs/contrib/bin/gtar is GNU tar. It is capable of uncompressing while untarring. Do the following:

gtar zxvf foo.tar.gz

Adding the z flag tells it to uncompress.

i've got fix-img-1.4 on www.people.virginia.edu/~pj7u/fix-img1.4/
Don't put them on watt and tell us to go look at them. Not all of us have watt accounts. Put them on the CS machines where you don't have to worry about psycho users rummaging through things that have loose permissions. Also, if you want the benefit of all the perl stuff I installed, and if you want to be able to look at the files from webman, you'll need to have it here anyways.

i have a couple other link checkers, filenames should be self explanatory -- what version of Sun OS do we have again?
We run Solaris 2.5.1, which is also called SunOS 5.5.1.

I downloaded one for OS 4.1.3 and Linux
Don't use SunOS 4.x binaries under Solaris. While they work most of the time, they don't work well most of the time. Also, it is bad to download binaries off the net. They might contain trojan horses or other security problems. Better to compile here ourselves.

they're all on chmod ugo+rwx so go ahead and do what you can...
No need to be so permissive. Do three things when you want to enable us to look at something:

newgrp webstaff
chgrp webstaff yourfiles
chmod g+rw yourfiles

The newgrp puts you in the webstaff group (your default group is 'ugrad'), and the chgrp and chmod give the webstaff group access to your files. Then, we can all look at them by doing 'newgrp webstaff' ourselves and reading your files.

Don't i need find.pl to execute fiximg-1.4.pl?
I can't tell. Read fiximg-1.4.pl and see what it says. If it doesn't run right, figure out why.



Mark this bookmark!

http://www.mcp.com/sams/books/175-0/175-0.html

I can't imagine why, but SAMS publishing has put the entire text of many of their books online. This just can't help sell books. Why would I buy the book if I can get it on-line?

Anyways, that URL is the location of the book "Apache Server Survival Guide." Apache is the web server we use, so this book has a wealth of information about configuring, maintaining, and running the Apache server.

I tend to do most of the server configuration work. But if you've ever wondered "Can our web server do *that*?" now you can look it up in a pretty good book. If there are capabilities there or features that you want to enable, let me know. If you just want to know how our web server works, this is a good start.

While I'm on the subject of our web server, did you know that all the following web servers are really just our one web server?

http://www.cs.virginia.edu/
http://www.alice.virginia.edu/
http://www.legion.virginia.edu/
http://acm.cs.virginia.edu/

FYI, Paco



D. Mansouri writes:
Here is a question for you. Why can't I use the arrow keys when I'm in input mode in Vi?
This is a known problem with Solaris vi. They just didn't put in a high-performance vi. (as if there were such a thing)

Use emacs. End of problem. Look at http://www/~csadmin/software/emacs/ . There are instructions there on how to print a cheat sheet. You'll never switch back, once you put emacs to work for you.

If you want a more functional vi (if you must shackle yourself to such primitive technology), use "vim"; it's a smarter vi.

Paco



James Babski writes:
I am writing you as one of the team members designing the VEF page and need access to the cgi-bin directory on the cs webserver on which the page is hosted in order to test a cgi alumni database I have been working on. Any help you could offer would be much appreciated.
Here's the way it works.

  1. Create a cgi-bin subdirectory in your public_html directory.
  2. Put whatever script you want to run there.
  3. Use "cgiwrap" in your web pages.

    Let's imagine that you have a script named "ALUM-SCRIPT" and you want to call it from a web page. Copy it to the cgi-bin subdirectory in your public_html directory. Make sure it is readable and executable by everyone ("chmod a+rx ALUM-SCRIPT").

    The URL for that script is:
    http://www.cs.virginia.edu/cgi-bin/cgiwrap/YOUR USERID/ALUM-SCRIPT

    This way, you don't need special privileges to run CGI programs.

    Let us know if you have difficulties with this. You might also want to type "man cgiwrap" at the command line to get more information on cgiwrap.

    Paco



Web Support writes:
unexpected end of line. This entry has been ignored.
A couple reminders:

1. No webman cron jobs should run on cobra. That's an interactive server. Our jobs belong on either a compute server (if they're compute intensive) or on www.cs.

2. You can't have blank lines in a crontab file. That's what this error means.

So, get the blank line out of the crontab file and move the cron job to www.

Paco



Philo Juang writes:
what's a cron job?
cron is a Unix daemon that runs all the time. You submit "jobs" to cron and it executes them at a specific time. This is how we do things like clearing out log files every week, running statistics every 20 minutes, etc.

cron jobs are controlled by a file called "crontab" which contains job specifications. Namely it contains a code for the day and time when you want jobs to run and the command you want to run at that time. Our crontab file has an error in it: a blank line.
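For reference, each crontab entry is five time fields (minute, hour, day of month, month, day of week) followed by the command to run. A sketch (the runstats path is made up, purely for illustration):

# minute hour day-of-month month day-of-week   command
0,20,40 * * * *   /home/webman/bin/runstats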

Also, every single Unix machine runs its own cron daemon, so if you're not careful you can have multiple crontab files scattered all over the network. If you have a job that needs to be executed on a big powerful machine, you put it in that machine's crontab. If you need it to run on a particular machine (like the web server) you put it in that machine's crontab.

We have one that's running on cobra. Cobra is an "interactive server." It's intended for people to do interactive work (like read mail, browse the web, edit files, etc). It's not intended to do computational work. Computational jobs compete with the interactive jobs, giving people poor interactive performance (mail runs slower, browsing the web is slower, etc).

Patrick has installed the following crontab entry on cobra:
# cron table for webman@cobra
# update the WebGlimpse index weekly, Sundays at 6:00am

00 06 * * 0     /home/webman/server/htdocs/search/wgreindex > /home/webman/server/htdocs/search/reindex.log 2> /home/webman/server/htdocs/search/reindex.log

It updates the search index for our search feature. However, it's running on cobra and it's doing a lot of work. That's why it's inappropriate. It should be moved to www.cs, where we already have a substantial number of web-specific cron jobs.

Patrick, since you're the culprit, would you mind fixing this?

Paco



For all those who want to (or have to!) use C/C++ to write web applications, I installed Tim Berners-Lee's W3.ORG C library into /uns/lib, /uns/include and /uns/bin on the alphas. See

http://www.w3.org/pub/WWW/Library/

for information about using the library.

Personally, I'd prefer Perl for writing anything that has to deal with the Web (ok, Tessa, maybe even Python :) ), but if you really have to use that nasty C stuff, this library should contain a lot of stuff that'll come in handy...



Gabriel Robins writes:
I just created the following two EMail aliases files: ~webman/aliases/grads and ~webman/aliases/grads-alumni
Ok, this is all set. To test where the mail goes, here's what you do:

telnet mail.cs.virginia.edu 25
You should see something like:
220 CS ESMTP Mail Server ; Comments/Bugs to: root@cs.virginia.edu

Ask it to "expand" the alias. Type: expn grads

You should see something like:
250-AC Kit Chapin
250-Alice C. Lipscomb
250-Adam J Ferrari <|/users/ajf2j/bin/procmail_forward@archive.cs.Virginia.EDU>
250-Allison L. Powell

etc. etc. etc.

Type quit when you're done.

I'll CC webman on this.

Paco



Web Support writes:
/home/webmaster/server/htdocs/statbot/statbot.cron:
/uf21/webmaster/server/htdocs/statbot: permission denied
FYI, we should never hard-code the partition name into anything. The statbot stuff has been around for a while, so this mistake probably wasn't made by anyone on our current web team. Anyways, it's worth mentioning since it's an easy mistake to make.

Webman was moved over the weekend. It now has a 2 gig disk all to itself. When it moved, it moved from /uf21 to /af19. The script had "uf21" encoded in one place, which didn't work after the move.

It's fixed. If you ever need to refer to webman's home directory, always refer to it as "/home/webman" or "/users/webman." Those are both symbolic links that will always work.

FYI, Paco



Philo Juang writes:
try typing "./make"....
That won't work unless make happens to be a binary in your current directory. It isn't. There's no magic to that command.

Look at webman's .profile. The first lines say:
# The following section is required to establish your environment
# on most ACC machines. It should not be modified or removed.

Someone has commented all that stuff out. No wonder things don't work. It says "should not be modified or removed" for a reason.

I'm going to uncomment that initialization section so that it executes, and then modify the PATH line that appears later in the file.

Make works fine for me as webman now. The default printer for webman is now 'bw', the big black & white printers in 233.

One more important caveat: Most (if not all) of you have watt.seas accounts. DO NOT wholesale copy over watt.seas settings to CS department accounts. It won't work. In particular, don't copy over the .variables.ksh file. You can certainly copy some things (like your favorite prompt, useful aliases, etc). Many system settings on watt, however, don't work here (e.g. PATH, LD_LIBRARY_PATH, MAIL etc).

Paco



Be careful with frame grabbing. We don't have anything which does MPEG-2 encoding (hardware MPEG-2 encoders cost the same as nice sportscars). All we have are software AVI encoders: they take a long time and their compression is so-so.

The 5-minute news article about the ATM switches was about 13 MB using maximal compression. The Mac probably doesn't have enough disk to do more than 5- or 10-minute snippets. This is a cool idea, and it might be good to have 5- or 10-minute snippets of colloquium talks available. It's not feasible (in either time or space) to digitize a whole 45-minute or 1-hour talk.

The digitizer is in the VR lab; I can show you some of the basics any time.

Another concern: any large files (multi-megabyte files) should ideally be served by our FTP server. I can make space available there as necessary. The reason is that FTP is a better protocol than HTTP (HTTP was designed for small text documents) for transferring large amounts of data. Plus, using the FTP server will reduce the load on the web server for serving large documents.

Paco



D. Mansouri writes:
Is there any way to get the cron daemon to NOT send us this message? I know these directories are unreadable.
I don't know where this is running. It should be running on the web server, but it's not in webman's cron file there. Why not?

Every time someone creates a cron job I have to remind them to put it on the web server. As for the mail: the only way to stop it is to get rid of all the error output. Of course, if something important is failing, then we won't see those errors, either. Change your find command from:

find blah blah blah -print > found_files

to

find blah blah blah -print > found_files 2>/dev/null

That redirects file descriptor "2" (stderr) to /dev/null.

The reason the find2perl business didn't work for you is that you're using the wrong perl. The top of your file says: #!/uva/bin/perl

Change that to

#!/usr/cs/contrib/bin/perl

That will work better and allow you to use that perl snippet I sent you.

Paco



D. Mansouri writes:
I can only assume that the second location contains version 5. whereas the first one doesn't. Correct?
Yes. More importantly, it is perl5 with all the cool HTML, HTTP, MD5, Sockets, etc, etc, etc, modules that I've downloaded and compiled. /uva/bin/perl5 is also perl5, but that version won't find all the cool things I've added.

Paco



D. Mansouri writes:
I wrote a little Perl script that finds all the .html files that have been changed in the past 24 hours. It generates a report.
I looked at your script. I noticed that you call out to the system to run find. That call can be avoided by using the following bit of perl. Put this at the top of your code:

require "find.pl";

Put the following subroutine near the top of the code:

sub wanted {
    (($dev,$ino,$mode,$nlink,$uid,$gid) = lstat($_)) &&
    (int(-M _) < 7) &&
    /^.*\.html$/ &&
    print("$name\n");
}

Then, when you want to find all the files, issue the following call to the 'find' subroutine:

&find('find','server/htdocs','public_html');

Notice a few things:

  1. You can change what happens when it examines a file to see if it matches your expression. Right now, if it matches, it calls 'print("$name\n");' You might want to push that name onto a stack, which you work with later. I.e. replace that print statement with 'push(@filelist, $name);' Later on you can do 'foreach $modifiedFile (@filelist) ...'

  2. You can change what directories are searched by changing the call to find. In fact, you might want to specify them more fully: /home/webman/server/htdocs instead of 'server/htdocs'. The way you have it now, the program won't execute correctly if your current directory is not /home/webman.

    Now, am I such a perl guru that I can generate this totally opaque code in my spare time? No way. There is a program called "find2perl" which converts a valid 'find' command into the appropriate perl code. I just took the find command that you were running in the 'system()' command and ran it through 'find2perl'
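    If you want to try it yourself, give find2perl the same arguments you would give 'find' and capture what it prints (the predicates below are only an example; use whatever your original find command had):

    find2perl server/htdocs public_html -name '*.html' -mtime -7 -print > generated.pl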

    It makes your program a bit more flexible to not depend on calling out to other programs.

    Good job, by the way, on the program. Its output looks good.



3 steps to adding recursive make capabilities to a subdirectory where such capabilities did not exist previously.

1.  Add a SUBDIRS= line near the top of the Makefile.  E.g.:
    SUBDIRS=faculty grads undergrads
2.  Find the target line for the "all" target.  It usually looks like:
	all: ${HTMLFILES}
    Change it to:
	all: ${HTMLFILES} ${SUBDIRS}
    (This tells 'make' that the "all" target depends on all the
    subdirectories, too.)
3.  Copy the following rule from the main top Makefile
    (server/htdocs/Makefile) 

# This rule says that everything in ${SUBDIRS} depends on ${INCFILES}.
# To make the stuff in the subdirectory, it cd's to the directory and runs make there.

${SUBDIRS}: ${INCFILES}
	@echo "==> $@"
	@(cd $@; ${MAKE})

Put it anywhere near the bottom of the Makefile. Make sure it appears after the "all" target. IMPORTANT: The two lines above which begin with an @ sign have TABs in front of them. These MUST be TABs. 8 spaces will not do.

That's it. This is now done in the SEAS directory. It makes all the html files in the centers subdirectory.

Paco



Just some tips and notes about Makefiles. I only mention them because I've seen these things a couple times now.

  1. Don't do this: TOPDIR=/home/webman/server/htdocs/

    Leave the trailing slash off the TOPDIR value:
    TOPDIR=/home/webman/server/htdocs

    So far this hasn't caused any trouble, but it could.

  2. Continue lines using the backslash at the end of the line:

RAWFILES=index.raw biomed.raw chemical.raw civil.raw cs.raw dean.raw\
	 ee.raw materials.raw mane.raw systems.raw tcc.raw facindex.raw\
	 facindex2.raw 

There is no need for them to be all on one line. It's harder to read that way.



Philo Juang writes:
I've been meaning to ask this Unix question for a long time: why do some files -- like ee.raw for example -- have the tilde after it? And what does it mean?
When someone edits a file with emacs, and they don't have any special .emacs file to disable this behavior, emacs creates a backup copy of the file named "file~". This is the file as it was before that session in emacs. They can be harmlessly deleted (and should be) when you're sure they're no longer needed. Most people disable them by putting the following into their .emacs file:

(setq make-backup-files nil)
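If you want to sweep out the backup files already lying around under a directory, a find like this will do it; run the first line by itself and eyeball the list before you pipe it to rm (the public_html path is just an example):

find $HOME/public_html -name '*~' -print
find $HOME/public_html -name '*~' -print | xargs rm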

Paco



Gabriel Robins writes:
How do I convert a PC file that I FTP to UNIX so that all the PC line feeds -- which appear as ^M -- become carriage-returns? Thanks - Gabe
dos2unix pcfile > unixfile

The reverse also works

unix2dos unixfile > dosfile

Paco



Jon Sweet writes:
Here's my script: #!/usr/local/bin/perl
There's really no such thing as /usr/local on CS machines. /usr/local is just a symbolic link to /usr/cs.

At any rate, the perl you want is /usr/cs/contrib/bin/perl (that's perl5). I have installed the CGI module for it already. You don't need to modify its include path to find it.

Lastly, if your script lives in /home/jds5s/public_html/cgi-bin (which it should...) then the URL is:

http://www.cs.virginia.edu/cgi-bin/cgiwrap/jds5s/SCRIPT

Make your first line:
#! /usr/cs/contrib/bin/perl

and just have

use CGI;

in there somewhere.

Paco



Philo Juang writes:
She wants to password protect her lecture notes so nobody can just download all the notes at the beginning of the semester. Anyone know how to password protect in html?
Quick tutorial on password protection. (Someone turn this into an FAQ, huh?)

  1. Put all the files you want to protect in the same directory, or under one top-level directory. e.g. public_html/secure/
  2. Create a world-readable file named .htaccess in that directory. Put the following entries in it:

        AuthType        Basic
        AuthName        SecureWebPage
        AuthUserFile    /home/bah6f/public_html/secure/.htpasswd
        AuthGroupFile   /home/bah6f/public_html/secure/.htgroup
        <Limit GET>
          require user guest
        </Limit>

    The example above is /home/bah6f/public_html/secure/.htaccess. Note that you're using REAL paths, not paths as the web server sees them.

  3. The .htgroup file is pretty minimal. Just create one group, unless you're really interested in using groups.
  4. There is a program called "htpasswd" which will generate the .htpasswd file. You can run "htpasswd .htpasswd USER" (add the -c flag the first time, to create the file).
  5. That's it.

If you look at /home/bah6f/public_html/secure (which corresponds to http://www.cs/~bah6f/secure/) you'll find that it is password protected. User "guest" with password "gimme" can get to it.

http://www.apache.org/docs/mod/mod_auth.html will give you a good starting point for more info.



We now have a couple little utilities which will help Unix users who get that occasional BinHex file from a Mac user.

mcvert - Decodes BinHex 4.0 format (usually denoted .hqx)
unsea - Extracts SEA files (self-extracting archives)

Both programs will give you useful help if run with no command arguments.

If you have a file MacProgram.sea.hqx then 'mcvert' followed by 'unsea' should extract the files for you.



Joonsoo Kim writes:
I think the webmaster directory was changed since webglimpse was installed so the file "wgreindex" pointed to /a/athena-fo/...
I'll look at it some. Just an FYI on this sort of thing: most programs that try to figure out your current directory will come up with that /a/athena... path. While that might work one day, it doesn't necessarily always work, particularly if we move webman to a different disk (as happened a few months ago) or in some other, obscure cases. The bottom line is that if you use /home/webman (or /users/webman) it will *always* work no matter what. So, when you see a program picking a goofy path name, just do a search and replace (learn that emacs!) and replace /a/apollo-fo/af19 with /home.
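Or, from the command line, the same perl -pi trick from the other notes here will do it; cd to wherever the script lives first, and keep the .bak copy around until you've confirmed it still runs:

perl -pi.bak -e 's!/a/apollo-fo/af19!/home!g' wgreindex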

I'm running the reindexing now to see how it does. It takes a while.

Paco



Gabriel Robins wrote:
Paco, can you please make ~seas/resdev/sponsors.html a UNIX symbolic link to http://www.cs.virginia.edu/research/sponsors.html ? Will this work better than the HTML-redirect currently in place?
I think I've done the best sort of redirect. This particular page is pretty important. I've seen how much mail it generates. So, I have a line in /home/webman/server/httpd_current/conf/srm.conf:

Redirect /~seas/resdev/sponsors.html http://www.cs.virginia.edu/research/sponsors.html

This redirects at the server level. The best thing about it is that it works with all browsers, and the browser will display the new URL. This isn't as simple as a symbolic link, but it works better. The symbolic link makes the person think the file is still there. This kind of redirect makes it clear that the page has moved, but they still get the new page.

You guys might want to add that bit above about the redirect to the growing pile of tidbits about web server administration.

In fact, you guys have gotten increasingly better at using Unix. Maybe one of you wants to sit down with me one afternoon, and have me show you some of the internals of the web server? I don't mind maintaining it, but if you know its capabilities, you can more fully utilize it.

Paco



Joonsoo Kim writes:
How do you tell what version of Unix the cs server is operating on?
uname is the general command. 'uname -a' tells you all the info. For instance:

SunOS atlas-fo.cs.Virginia.EDU 5.5.1 Generic sun4m 1 SUNW,SPARCstation-10

That tells you the machine atlas-fo.cs.virginia.edu (which is the real name of the web server) is running SunOS 5.5.1 (also known as Solaris 2.5.1, ugh).

The kernel name is "Generic" (meaning we haven't applied any patches or local modifications to the Unix kernel).

It is a sun4m architecture (which is rarely useful to know).

Sun's official name for this machine model is SUNW,SPARCstation-10.

There are various switches you can give to the uname command to get it to tell you subsets of that information. For instance, uname -n tells you the machine name. uname -s returns the name of the OS ("SunOS") and uname -r returns the revision number on the OS ("5.5.1").
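Quick reference:

uname -n     # machine name
uname -s     # OS name ("SunOS")
uname -r     # OS revision ("5.5.1")
uname -a     # everything at once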

Also, I need to install a new version of glimpse. Do I ask the admin to replace the existing version or install it to /usr/local/contrib? Thanks
A note: don't use /usr/local to refer to /usr/cs -- /usr/local is just a symbolic link to /usr/cs.

Are you sure you need a new version? The Glimpse web page at http://glimpse.cs.arizona.edu/ says the latest version is 4.0B1. glimpse -V says:
This is glimpse version 4.0B1, 1996.

I don't think there is a newer version.

Perhaps you mean you want to install a new version of webglimpse? That's installed locally in
/home/webman/server/htdocs/search2/webglimpse-1.1

You can just install in a parallel subdirectory. No need to put it in contrib.

Cool?

Paco



I don't know why, but now when I type make in the top level directory, it wants to make the whole doggone web tree. If someone is making changes in the macros file, be very careful. When you are done changing the macros, and you are sure they are correct, be sure to make all the pages so that we don't have pages with different headers/footers or whatnot.
Also, down in misc, I got the following error from make:
"news-wulf-iuva.raw", line 3: macro_news_header: argument mismatch
That's because the macro_news_header expects 3 arguments: Title, Source, Author. I moved the line "By Charlotte Crystal" into the macro_news_header line, and it looks better. Let's be careful how we use the macros. The whole web tree can be designed such that it will make with no warnings or errors.

Another caveat: always use double quotes on URLs in links:

Not this: <a href=URL>University of Illinois</a>

But this: <a href="URL">University of Illinois</a>

That way, if there are spaces or unusual characters in the URL, they won't confuse the browser.

FYI, Paco



Golden nugget:

The steps for creating a GIF with transparency using our CS tools.

  1. Start with a raw image. Preferably TIFF (which is lossless).
  2. Try to remove any scanning artifacts (if you're working with a scanned image). Or try to remove digital photography artifacts, if that's what it is.

    What do I mean by artifact?

    - Non-solid colors in places where the color should be solid (e.g. if you scanned a business card, the white of the business card might come out kind of mottled).
    - Flash points (places where a camera flash is really bright) to take out as well.

    Images that were originally rendered digitally (e.g. in PhotoShop or xpaint) won't have these artifacts.

  3. If you're concerned about size, try reducing the color depth or the number of colors. In 'xv' there is a color map editor where you can change the RGB values of any color in the color map. That's what I sometimes use.
  4. Using your favorite image editor (I use 'xpaint'), flood-fill any areas that you want to make transparent. Fill them with, preferably, a primary color. You can only have one colormap entry be transparent, so you basically choose which color you want to represent transparent. I often use bright yellow or green.
  5. This has to be done on the Unix side, but it can be done from the command line. Use the 'giftool' program to set the transparent color. 'giftool -h' will tell you about it. Here's the command line to produce a file named 'logo-trans.gif' from 'logo.gif' where we set all the yellow pixels to transparent:

    giftool -o logo-trans.gif -rgb "Yellow" logo.gif

That's it.

Paco



John Walter Baxton Jr wrote:
Can you tell me the default window size for a browser?
Something worth noting: it is possible to design a web page that works properly regardless (almost) of the browser size.

The bottom end resolution that you have to consider is probably 640x480. Anyone who works at that resolution probably maximizes their browser. Remember, though, that they are free to choose whatever font they like.

If you're working on a high-res monitor, you might try choosing goofy default font sizes in your browser sometimes, just to get a feel for what weird things can happen. Try a really big one and a really small one. While that doesn't simulate changing resolution (because the images will remain the same effective size), it does show you how things can change, and it will help you figure out how your layout is rearranged to accommodate the different text sizes.

Just a suggestion, Paco



Joonsoo Kim writes:
We changed the macros file in the include directory so that it included:
macro_header.html
macro_footer.html
macro_buttons.html
index_buttons.html
Sounds like Gabe fixed it, but I'll explain what you guys seem to misunderstand. Here's a quick anatomy of things you find in a Makefile:

#
# comments.
VARIABLE=value
target: dependency1 dependency2
        command for that target
        another command for that target
dependency2: dependency3 dependency4
        separate command for that target
        another command for that target.

A VARIABLE=value line just sets that variable's value for convenient reference later.

The first line that has a colon in it is the first target in the makefile. By default, if you just type "make" that's the target it will try to make. All other targets are also "makeable." For instance, you can type "make index.html" and it will only make that one file.

A target has two things associated with it:

- dependencies
- commands

dependencies are file names. Make looks at the date on the dependency file, and on the target file. If the dependency file is newer, it performs the commands associated with that target.

commands are just Unix commands. They can do anything you like.

In my skeletal example above, the target "target" depends on "dependency1" and "dependency2". If I say "make target" it will check on the two dependencies. It might have to make "dependency2" before it makes "target". It will use file timestamps to figure that out.

Here's the part that you guys missed: just because a target is dependent on "X" doesn't mean that "X" is necessarily listed in the commands or somehow involved in compilation. For instance, I can make a target which is dependent on a non-existent file named "foo". Make will exit with an error saying "don't know how to make foo."

When you work with Makefile targets, you have to do 2 things, which are mostly independent of each other.
1. Make sure you list the correct dependencies for the target.
2. Make sure the commands which make a particular target do the right thing.

Nothing automatically enforces this. Because 'make' is such a general purpose tool, there's no way to automatically check these sorts of things. So it is possible to depend on something that you don't use during compilation.

What if we have one particular HTML file which depends on a bunch of goofy stuff that no other HTML file depends on? Take it out of the HTMLFILES list, and create a separate target just for it:

goofy.html: ${INCFILES} goofy.txt
        @rm -f $@
        ${CPP} -I${INCDIR} -P goofy.raw > goofy.html
        some other command
        another command

Hope that helps. There's an O'Reilly book on Make and Makefiles. You may want to have that around to help. There's also a man page on 'make' which tells you very specifically what it does.

Paco



Jason O. Watson writes:
Can you offer any suggestions as to what we should/can do about this?
Recursively altering the web pages is not that hard actually. It just requires stringing lots of commands together into something sensible. Here's a very long description of what's involved.

I have a one-line perl command that will find all instances of one thing and replace them with another thing.

/usr/cs/contrib/bin/perl -pi.BaK -e 's!OLDSTRING!NEWSTRING!g'

For instance, say I wanted to change all instances of "http://www.cs.virginia.edu/" to "macro_CS_URL"

I'd do the following:

perl -pi.BaK -e 's!http://www.cs.virginia.edu/!macro_CS_URL!g' file1 file2...

This does 2 things:

- It performs the global substitution on every file listed on the command line.
- It creates a backup file named file1.BaK before it does the substitutions.

So if I listed "file1" and "file2" on the command line, I'd have file1, file2, file1.BaK and file2.BaK.

Once I verified that the substitutions did what I wanted, I'd remove all the *.BaK files. (more on that later)

So, the only thing left to do is figure out "how do I get the names of all the .raw files and feed them to this perl command?"

'find' will give you the file names. The command

find /home/bah6f/public_html/funnies -name \*.raw

will find all the raw files and print out a list like this:

/home/bah6f/public_html/funnies/gradParables.raw
/home/bah6f/public_html/funnies/stateMottos.raw
/home/bah6f/public_html/funnies/newMovies.raw
/home/bah6f/public_html/funnies/dumbCrooks.raw
/home/bah6f/public_html/funnies/hellPhysics.raw

Now, feed them to the perl script. We can do this one of 2 ways.

1. Save the list to a file, and then stick it on the command line
2. Pipe the output of find onto the perl command line.

Method 1:
find /home/bah6f/public_html/funnies -name \*.raw > RawFileList

Creates a file named RawFileList with all the files in it. Now 'cat' that onto the command line:

perl -pi.BaK -e 's!OLD!NEW!g' `cat RawFileList`

Note the two different quotes and how they're used. The string `cat RawFileList` is replaced by the contents of the file. The actual command executed is:

perl -pi.BaK -e 's!OLD!NEW!g'
/home/bah6f/public_html/funnies/gradParables.raw
/home/bah6f/public_html/funnies/stateMottos.raw

Method 2: dispense with the intermediate file (RawFileList)

find blah blah blah | xargs perl -pi.BaK blah blah blah

'xargs' takes the output of 'find' and puts it on the command line of 'perl'. Exact same result, but without a temporary file. We use a pipe instead.

This shows us how to remove all the .BaK files once we're sure we don't want them:

find /home/webman/server/htdocs -name \*.BaK -print | xargs /bin/rm -f

All this is really dangerous. Once you let one of these commands loose, you will recurse the entire webman hierarchy doing something to every file. I'm working on combining this into a perl script that will do some sanity checking and allow you to double check what it's about to do.

If you want to try these commands out, I recommend you create a little sandbox in /usr/tmp on some machine and play there until you fully understand how it works.

Paco



Jason Watson writes:
d:\jason\gifs\ on the webman computer and i am now in the process of converting them to JPG format.
This can be done batch-style on the Unix side. The command 'cjpeg' compresses images to jpeg. It also has a '-progressive' flag to make a jpeg a progressive jpeg. You should look at the man page for 'cjpeg' it might be useful. If you want to turn a jpeg into something else (like a GIF or ppm or something) there is a corresponding 'djpeg' command. Here's a 'for' loop for your command line, or it can be written into a script.

#--------------------------------------------------
for GIF in *.gif
do
  # this line takes a file name like "foo.gif" and strips off the .gif.
  # the variable $BASE will then be "foo"
  BASE=`basename $GIF .gif`
  # perform the conversion
  cjpeg $GIF > $BASE.jpg
  echo "$GIF ... $BASE.jpg"
done
#--------------------------------------------------

How's that?

Also, bear in mind the following caveat (from the cjpeg man page)

Color GIF files are not the ideal input for JPEG; JPEG is really intended for compressing full-color (24-bit) images. In particular, don't try to convert cartoons, line drawings, and other images that have only a few distinct colors. GIF works great on these, JPEG does not. If you want to convert a GIF to JPEG, you should experiment with cjpeg's -quality and -smooth options to get a satisfactory conversion. -smooth 10 or so is often helpful.
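So a reasonable starting point for a GIF with big flat-colored areas might be something like the following; the numbers are just values to experiment with, not magic:

cjpeg -quality 75 -smooth 10 logo.gif > logo.jpg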

Paco



Jason O. Watson wrote:
i have found a collection of perl scripts that are useful for html applications at: http://www.oac.uci.edu:80/indiv/ehood/perlWWW/
FYI, my bookmarks are available from my home page. (or from http://www.cs.virginia.edu/cgi-bin/cgiwrap/bah6f/bookmarks)

Near the top are several good web-server and perl-related links.

Paco



Jason O. Watson writes:
I've placed comments at the bottom of the: /home/webman/server/apache_1.0.3/conf/srm.conf to reflect what I think *should* be added in order to make this work. I just want to be sure. Is what I have right? If so, do I need to run a make of some sort, or will it automatically take care of things on its own?
I uncommented them and restarted the server. They seem to be working well. To restart the server:

  1. Login on www.cs
  2. cd /home/webman/server/httpd_current/logs
  3. run "./restart"
  4. wait a few seconds, then run "tail error_log" You should see:

SIGHUP received. Attempting to restart
Server configured -- resuming normal operations

(the "tail" command shows you just the bottom 10 lines of a file. Likewise the "head" command shows you the top 10 lines of a file)

I just can't seem to get it to work. It is called fiximg.pl and I have placed it into the /home/webman/perlbin/ directory. Is there something that I am doing wrong with this one?
I'll try to take a look at it today.

Paco



Jason O. Watson writes:
One major question: I know that it is possible to determine whether or not a person is from a specific domain -- i.e. virginia.edu -- and go ahead and restrict access from outside that domain, *but* is it possible to determine the domain, and based upon that, automatically load either a public version of the site or if from the approved domain, load a "restricted" version???
Yes, but the only way I know how to do it is via CGI. If your index file is not "index.html" but instead is "index.cgi" then it is executed as a CGI script. In that case, you can have a shell script like this:

#! /bin/sh
#
# check domain and return pages
echo "content-type: text/html"
echo ""
case $REMOTE_HOST in
        *virginia.edu*|128.143.*)
          cat virginia.html
          ;;
        *)
          cat restricted.html
          ;;
esac

You can do more sophisticated things too, based on browser type, if you want. Once you do this with your index page, you have a wealth of possibilities.

Paco



Jason O. Watson wrote:
is gd 1.2 already installed on the server somewhere? if not let me know and i'll go ahead and do it -- it's needed for that new stat program.
Yep. The library is /usr/cs/contrib/lib/libgd.a and the include file is in /usr/cs/contrib/include. You'll need to add "-I/usr/cs/contrib/include" and "-L/usr/cs/contrib/lib" to the Makefile, if it doesn't give you some easier way to add those parameters to the build process.
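If the package's Makefile doesn't give you an obvious place for those flags, a hand-run compile line for a single-file test program would look roughly like this (mystats.c is a made-up name, and the -lm is there because gd sometimes wants the math library):

cc -I/usr/cs/contrib/include -o mystats mystats.c -L/usr/cs/contrib/lib -lgd -lm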

Paco



Philo Juang writes:
i think i've learned enough of cgi-bin to do a little bit of programming -- not perl, though, in a mix of C and C++. i'll dig around the usr directories to see if we've got CGIc and ACGI libraries to use...
FYI, take a look at /home/bah6f/public_html/cgi-bin/test-form.

(URL: http://www.cs.virginia.edu/cgi-bin/cgiwrap/bah6f/test-form)

That's a very simple test form in perl, and I can give you some reference material on how to do more. It's simple and powerful.

Paco



Jason O. Watson writes:
what is the status of the post script files and their conversion into html?
If that's your ultimate goal, you should get some other source form. Postscript is almost impossible to convert to html.

One of her papers was a LaTeX paper. There is a utility called "latex2html" which can convert that to a nice set of pages.

The others were frame files. If she gives you the frame file, you can use framemaker to convert it to HTML.

That should save you lots of time and effort.

Paco



Someone put

stty dec

in the .bashrc for webman. While that probably makes your life easier on some telnet program you use, it royally messes up those of us using xterms. Imagining that a webman named "fred" likes this setting, perhaps we can put something like the following in the .bashrc:

function fred ()
{
        # set things the way fred likes them
        EDITOR=emacs
        PRINTER=cs-bw1
        PAGER=less
        export EDITOR PRINTER PAGER
        stty dec
}

That way, when Fred logs in, he can just type "fred" and get things set the way he likes them. I'll create a function "paco()" so that when I login, I'll type "paco" and get things set the way I like them.

Sound reasonable? This way we don't interfere with each other's work styles by modifying a global property.

Paco



I notice that someone put in: MANPATH=/usr/local/man:/usr/man

Be careful what you put in here. You are already supplied with a MANPATH:

: casper:~ ; echo $MANPATH
/usr/man:/usr/cs/man:/usr/dt/man:/opt/hpnp/man:/X11.6/man:/uva/man:/usr/cs/contrib/man:/opt/SUNWspro/man:/opt/man:/contrib/man

By overriding it, you effectively cripple the 'man' command for yourselves.

I've removed that line, since it did no good.
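If someone really does need an extra man directory some day, append to the value you're given instead of replacing it (the directory below is a placeholder):

MANPATH=$MANPATH:/some/extra/man
export MANPATH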

Paco



Jason Watson wrote:
I have fixed this. I created a file called error403.html. I made the needed change to srm.conf. Is there something I should do now so that the server will recognize this change or will it automatically happen on the weekly reset?
You can do either or both. Either login on www and run
/home/webman/server/httpd_current/logs/reset

or wait until saturday.

If you can't login on www as webman, you can do the following from cobra:
rsh www /home/webman/server/httpd_current/logs/reset

Check the last part of the error log immediately after that (use "tail error_log"). If it restarts correctly, you'll see a message to that effect. You should see two messages: one saying something like "Got SIGHUP. Attempting to restart." and then one like "Server configured. Resuming normal operations."

If you don't see anything after the "attempting to restart" then the server died and there was an error so that it couldn't restart. In that case, you're stuck. Only root can restart the web server, so you have to contact one of us.

Needless to say, if you're going to restart the web server, it's good to do it when you know one of us is around to help out if something goes wrong.



Jason Watson writes:
how can i access idraw from my Unix acct in exceed?
The only way you can still use idraw -- an ancient Unix drawing program -- is to login on one of our old SunOS machines, larc or stretch. Then, on larc or stretch, set your display environment variable to be the name of the X display you're sitting at (for instance, opal in room 233): export DISPLAY=opal:0 . Then you can run idraw.


Jason O. Watson writes:
Just a little quirk that i came across tonight: if you ever have the need to type the word "unix" and then run it through the pre-processor, it will come out as a "1".
It's because the macro "unix" is defined to be "1". This is so you can do things like:

#ifdef unix
do something
#else
do something else
#endif

That's more important in C programming (where cpp was developed) than it is in HTML processing.

I was just reading the man page on cpp (which you might want to do sometime). There are a couple options which might interest us:

-undef -----> Remove initial definitions for all predefined symbols.

This will undefine "unix" and anything else that is defined by default. You might also find things like SunOS and _Solaris_ are defined by default.

Note also: -R -----> Allow recursive macros.

Didn't we want to use a recursive macro at some point? Something like:

SEAS_faculty(blue_letter(t))

The -R will allow that.

If any of these options are useful to us, you can probably figure out how to put them into the make file.



Jason O. Watson writes:
I am finished with taking pictures with the camera. what is the best way to transfer the pics that are on the laptop onto the network? is there a special card or something for a modem/ethernet somewhere?
Yep. It's in 233, right next to the webman PC. It's a box we call the "GatorBox." You'll find a cable attached to the GatorBox that plugs into the printer/modem port on the laptop. Once you've done that, you can enable AppleTalk on the laptop. Then, you should be able to use the chooser on the laptop to select apollo (or athena, depending on what home directory you are moving images to). For webman you want apollo /af19.

Once you choose apollo /af19, you will be able to drag the images over to the Unix filesystem directly. It's slow (about 100 Kbaud), but it'll work.

If you haven't already used PhotoFlash to save the images as TIFF, I would do that before transferring them to the Unix side. Neither Unix nor Windows programs understand the QuickTake's native format.

Also, when the Mac mounts the Unix directory, silly things will happen. If the Mac creates files in a Unix directory (which it will do if you drag from the Mac to the Unix directory), it will create two files. One is called the "resource fork" and the other is the "data fork." Under Unix or Windows, all we care about are the "data forks." Once it's all done, you'll see files like this in your directory:

Image1.tiff %Image1.tiff Image2.tiff %Image2.tiff 

You can safely remove the ones that begin with % signs.

Paco



How to get a cumulative access counter for your page
First, you must name the file with ".shtml" instead of .html. In the future, I'm going to change this. But for now, that's how it is.

Secondly, put the following in the html file you want to count:

That text has to appear exactly as you see above. Capitalization, spacing, and punctuation must be identical.

When your web page is sent, that string will be replaced with a number. The number is the number of accesses to that URL. If you have symbolic links (e.g. HOMEWORK.HTML points to homework.html) the web server will consider these to be two different pages, and they will be tracked differently.

Q: Can I get this number without loading the page?
A: Yes. Look in /home/webman/server/counter. The file is named according to the URL. You should be able to find the file.

Q: Can I artificially bump the counter?
A: Sure. Edit the file in /home/webman/server/counter that corresponds to the page. If you have trouble editing the file because of file permissions, you'll have to send mail to root.

Q: Does the counter track /~userid/index.html differently from /~userid/ ?
A: No. The counter knows that those two files are the same. Other than that, however, the counter knows nothing.



I got your message about problems with the LWP module in perl.

I did 2 things to fix it.

1. I reinstalled the LWP module so that it was current with our version of perl.

2. I changed this line:

use LWP::Simple;
to this:
use LWP qw( Simple );

That doesn't produce any errors. See if it does what you want it to.

For testing purposes, you can run your script at the command line. It will say: (offline mode: enter name=value pairs on standard input)

You can enter parameters to simulate form input by doing things like this:
RADIO1=1

(that would make it appear that the radio button named "RADIO1" was checked)

Press ctrl-d when you've entered all your parameters (or press ctrl-d with no parameters to simulate the first time someone calls your script).



Apparently, whoever was on duty didn't know how to do this. This is something you should know. There are two ways to provide this service to him. The simplest way (but not the preferred way) is to go into /home/webman/server/htdocs and just create symbolic links.

e.g. "ln -s /home/cs340/public_html CS340"

The better way is to edit /home/webman/server/htdocs/.htaccess (make sure you are careful. Screwing up that file can render the whole web server inoperative)

Put "redirect" lines into that file. Mike wants /cs340 and /CS340 to go to /~cs340/. Put the following lines in that file:

Redirect permanent /cs340 http://www.cs.virginia.edu/~cs340/
Redirect permanent /CS340 http://www.cs.virginia.edu/~cs340/

These send a special redirect message to a user's web browser which tells it "The page isn't here, go HERE instead". That way, if a user types the wrong thing, they get the right thing. A symbolic link, on the other hand, makes the wrong thing work. I.e. someone types in /CS340 and they get a page back. They will bookmark the wrong URL. It will work, but it means that people will be bookmarking various different URLs.

For more information on this topic, see: http://www.rge.com/pub/infosystems/apache/docs/mod/ Look under "mod_alias"

That URL is the definitive guide to our web server. You'll be amazed at what it can do.

Paco



I have finished installing webglimpse 1.5, with some significant modifications, in the webman account. This fixes a large security hole and a couple of other bugs from the previous version.

I also fixed some problems that were introduced when someone on the web team moved important things from search/ to search2/. There are now two directories, as follows:

search: contains HTML, et al, that is visible to the world
search_data: contains hash tables, index files, and logs

Search2, ~/webglimpse-1.1, and ~/webglimpse-old can all be removed as you see fit.

The indexing script, wgreindex, used to run weekly but was removed from the webman cron table in June, and thus the index went about five months without updates. I just ran the indexer by hand, but I would highly recommend that someone put it back in cron so that it happens automatically, maybe weekly. For example:

8 3 * * 0 /home/webman/server/htdocs/search_data/working/wgreindex >& /dev/null

This would do a reindex every Sunday at 03:08AM. The indexing process takes around 4 hours, though only the final 10-15 minutes of that is actual CPU time. It can run on any Sparc+Solaris machine; cobra will do nicely.

I've documented the major changes that I had to make to the standard version of webglimpse to make it run properly. Any future maintainer should probably read what I've done... ~/bin/webglimpse/CHANGES-PATRICK.

Documentation is available from http://glimpse.cs.arizona.edu/, or e-mail me, patrickr@virginia.edu.

Happy searching.

--Patrick



Jason O. Watson writes:
I have added the following line to the cron file on the web server:

8 3 * * 0 /home/webman/server/htdocs/search_data/working/wgreindex & /dev/null

I don't think that's going to work.

">&" is a shortcut that many advanced shells (bash, tcsh, et al.) handle. It's a shorthand way of writing "> /dev/null 2>&1"

Unfortunately, cron executes scripts with the most basic of shells, sh. sh doesn't understand ">&". You'll need to make that:

8 3 * * 0 /home/webman/server/htdocs/search_data/working/wgreindex > /dev/null 2>&1

Remember, also, no blank lines in crontab files. A dorky Solaris limitation.
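Once you've made the change, 'crontab -l' on the machine in question shows exactly what cron has installed; that's a quick way to double-check the entry (and to spot stray blank lines):

crontab -l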



Sang Hyuk Son writes:
Do we already have acrobat distiller in our system? I don't want to make multiple copies of big programs. If we have it, tell me how to use it.
You should be able to use the command "ps2pdf".

If you have a file named slides.ps and you want to create a PDF file called slides.pdf, do this:

ps2pdf slides.ps slides.pdf

That should work for most anything you want to do.

Paco


