sitecopy(1) User Manuals sitecopy(1)

NAME

sitecopy - maintain remote copies of web sites

SYNOPSIS

sitecopy [options] [operation mode] sitename...

DESCRIPTION

sitecopy is for copying locally stored web sites to remote web servers. A single command will upload files to the server which have changed locally, and delete files from the server which have been removed locally, to keep the remote site synchronized with the local site. The aim is to remove the hassle of uploading and deleting individual files using an FTP client. sitecopy will also optionally try to spot files you move locally, and move them remotely.

FTP, SFTP, WebDAV and other HTTP-based authoring servers (for instance, AOLserver and Netscape Enterprise) are supported.

GETTING STARTED

This section covers how to start maintaining a web site using sitecopy. After introducing the basics, two situations are covered: first, where you have already uploaded the site to the remote server; second, where you haven't. Lastly, normal site maintenance activities are explained.

Introducing the Basics

If you have not already done so, you need to create an rcfile, which will store information about the sites you wish to administer. You also need to create a storage directory, which sitecopy uses to record the state of the files on each of the remote sites. The rcfile and storage directory must both be accessible only by you - sitecopy will not run otherwise. To create the storage directory with the correct permissions, use the command
mkdir -m 700 .sitecopy
from your home directory. To create the rcfile, use the commands
touch .sitecopyrc
chmod 600 .sitecopyrc
from your home directory. Once this is done, edit the rcfile to enter your site details as shown in the CONFIGURATION section.
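
The two setup steps above can be combined and checked in one go. The following is a minimal sketch, not part of sitecopy itself: it works in a throwaway directory (the variable name SANDBOX_HOME is an assumption made for illustration) so that it can be tried without touching a real home directory, and it uses find with -perm to confirm that the modes are exactly 700 and 600.

```shell
#!/bin/sh
# Sketch: create the storage directory and rcfile with the permissions
# sitecopy requires, then verify them. SANDBOX_HOME is a hypothetical
# stand-in for your real home directory.
SANDBOX_HOME=$(mktemp -d)

mkdir -m 700 "$SANDBOX_HOME/.sitecopy"
touch "$SANDBOX_HOME/.sitecopyrc"
chmod 600 "$SANDBOX_HOME/.sitecopyrc"

# find prints the path only when the mode is exactly the one given.
if [ -n "$(find "$SANDBOX_HOME/.sitecopy" -prune -perm 700)" ] &&
   [ -n "$(find "$SANDBOX_HOME/.sitecopyrc" -prune -perm 600)" ]; then
    echo "permissions ok"
else
    echo "permissions need fixing"
fi
```

To apply the same check to a real setup, the two find commands can be pointed at ~/.sitecopy and ~/.sitecopyrc instead.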

Existing Remote Site

If you have already uploaded the site to the remote server, ensure your local files are synchronized with the remote files. Then, run
sitecopy --catchup sitename
where sitename is the name of the site you used after the site keyword in the rcfile.

If you do not have a local copy of the remote site, then you can use fetch mode to discover what is on the remote site, and synchronize mode to download it. Fetch mode works well for WebDAV servers, and might work, if you're lucky, for FTP servers. Run
sitecopy --fetch sitename
to fetch the site - if this succeeds, then run
sitecopy --synch sitename
to download a local copy. Do NOT do this if you already have a local copy of your site.

New Remote Site

Ensure that the root directory of the site has been created on the server by the server administrator. Run
sitecopy --init sitename
where sitename is the name of the site you used after the site keyword in the rcfile.

Site Maintenance

After setting up the site as given in one of the two above sections, you can now start editing your local files as normal. When you have finished a set of changes, and you want to update the remote copy of the site, run:
sitecopy --update sitename
and all the changed files will be uploaded to the server. Any files you delete locally will be deleted remotely too, unless the nodelete option is specified in the rcfile. If you move any files between directories, the remote files will be deleted from the server then uploaded again unless you specify the checkmoved option in the rcfile.

At any time, if you wish to see what changes you have made to the local site since the last update, you can run
sitecopy sitename
which will display the list of differences.

Synchronization Problems

In some circumstances, the actual files which make up the remote site will be different from what sitecopy thinks is on the remote site. This can happen, for instance, if the connection to the server is broken during an update. When this situation arises, Fetch Mode should be used to fetch the list of files making up the site from the remote server.

INVOCATION

In normal operation, specify a single operation mode, followed by any options you choose, then one or more site names. For instance,
sitecopy --update --quiet mainsite anothersite
will quietly update the sites named 'mainsite' and 'anothersite'.

OPERATION MODES

List Mode - produces a listing of all the differences between the local files and the remote copy for the specified sites.
Flat list Mode - like list mode, except the output produced is suitable for parsing by an external script or program. An AWK script, changes.awk, is provided which produces an HTML page from this mode.
Update Mode - updates the remote copy of the specified sites.
Fetch Mode - fetches the list of files from the remote server. Note that this mode has only limited support in FTP - the server must accept the MDTM command, and use a Unix-style 'ls' for LIST implementation.
Synchronize Mode - updates the local site from the remote copy. WARNING: This mode overwrites local files. Use with care.
Initialization Mode - initializes the sites specified - making sitecopy think there are NO files on the remote server.
Catchup Mode - makes sitecopy think the local site is exactly the same as the remote copy.
View Mode - displays all the site definitions from the rcfile.

Verify stored state of site matches real remote state
Display help information.
Display version information.

OPTIONS


Append debugging messages to FILE (else use stderr)

Create root for remote site

Display but do not carry out the operation.
Applicable in Update Mode only, will prompt the user for confirmation for each update (i.e., creating a directory, uploading a file etc.).
Specify an alternate run control file location.
Specify an alternate location to use for the remote site storage directory.
Quiet output - display the filename only for each update performed.
Very quiet output - display nothing for each update performed.
Applicable in Update Mode only, displays the progress (percentage complete) of data transfer.
Keep going past errors in Update Mode or Synch Mode.

Perform the given operation on all sites - applicable for all modes except View Mode, for which it has no effect.
Turns on debugging. A list of comma-separated keywords should be given. Each keyword may be one of:
socket Socket handling
files File handling
rcfile rcfile parser
http HTTP driver
httpbody Display response bodies in HTTP
ftp FTP driver
sftp SFTP driver
xml XML parsing information
xmlparse Low-level XML parsing information
httpauth HTTP authentication information
cleartext Display passwords in plain text

Passwords will be obscured in the debug output unless the cleartext keyword is used. An example use of debugging is to debug FTP fetch mode:

sitecopy --debug=ftp,socket --fetch sitename

CONCEPTS

The stored state of a site is the snapshot of the state of the site saved into the storage directory (~/.sitecopy/). The storage file is used to record this state between invocations. In update mode, sitecopy builds up a files list for each site by scanning the local directory, reading in the stored state, and comparing the two - determining which files have changed, which have moved, and so on.

CONFIGURATION

Configuration is performed via the run control file (rcfile). This file contains a set of site definitions. A unique name is assigned to every site definition, which is used on the command line to refer to the site.

Each site definition contains the details of the server the site is stored on, how the site may be accessed at that server, where the site is held locally and remotely, and any other options for the site.

Site Definition

A site definition is made up of a series of lines:

site sitename
server server-name
remote remote-root-directory
local local-root-directory

[port port-number ]
[username username ]
[password password ]
[proxy-server proxy-name
proxy-port port-number ]
[url siteURL ]
[protocol { ftp | sftp | webdav } ]
[ftp nopasv ]
[ftp showquit ]
[ftp { usecwd | nousecwd } ]
[http expect ]
[http secure ]
[safe ]
[state { checksum | timesize } ]
[permissions { ignore | exec | all | dir } ]
[symlinks { ignore | follow | maintain } ]
[nodelete ]
[nooverwrite ]
[checkmoved [renames] ]
[tempupload ]
[exclude pattern ]...
[ignore pattern ]...
[ascii pattern ]...

Anything after a hash (#) in a line is ignored as a comment. Values may be quoted and characters may be backslash-escaped. For example, to use the exclude pattern *#, use the following line:
exclude "*#"
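
As a sketch of the two quoting forms just described, the script below writes a small rcfile fragment to a scratch file, giving the same pattern once quoted and once backslash-escaped. The scratch file is a hypothetical stand-in for ~/.sitecopyrc, and the escaped form is shown on the assumption that it behaves as the text above describes. In a real rcfile these lines would sit inside a site definition.

```shell
#!/bin/sh
# Sketch: two ways to keep a hash from starting a comment.
# FRAGMENT is a hypothetical scratch file, not a real rcfile.
FRAGMENT=$(mktemp)

# The quoted heredoc delimiter stops the shell from altering the text.
cat > "$FRAGMENT" <<'EOF'
# Quoted value: the hash is part of the pattern.
exclude "*#"
# Backslash-escaped hash: also part of the pattern.
exclude *\#
EOF

cat "$FRAGMENT"
```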

Remote Server Options

The server key is used to specify the remote server the site is stored on. This may be either a DNS name or IP address. A connection is made to the default port for the protocol used, or that given by the port key. sitecopy supports the WebDAV or (S)FTP protocols - the protocol key specifies which to use, taking the value of either webdav or ftp/sftp respectively. By default, FTP will be used.

The proxy-server and proxy-port keys may be used to specify a proxy server to use. Proxy servers are currently only supported for WebDAV.

If the FTP server does not support passive (PASV) mode, then the key ftp nopasv should be used. To display the message returned by the server on closing the connection, use the ftp showquit option. If the server only supports uploading files in the current working directory, use the key ftp usecwd (possible symptom: "overwrite permission denied"). Note that the remote-directory (keyword remote) must be an absolute path (starting with '/'), or usecwd will be ignored.

If the WebDAV server correctly supports the 100-continue expectation, e.g. Apache 1.3.9 and later, the key http expect should be used. Doing so can save some bandwidth and time in an update.

If the WebDAV server supports access via SSL, the key http secure can be used. Doing so will cause the transfers between sitecopy and the host to be performed using a secure, encrypted link. The first time SSL is used to access the server, the user will be prompted to verify the SSL certificate, if it's not signed by a CA trusted in the system's CA root bundle.

To authenticate the user with the server, the username and password keys are used. If it exists, the ~/.netrc file will be searched for a password if one is not specified. See ftp(1) for the syntax of this file.

Basic and digest authentication are supported for WebDAV. Note that basic authentication must not be used unless the connection is known to be secure.

The full URL that is used to access the site can optionally be specified in the url key. This is used only in flat list mode, so the site URL can be inserted in 'Recent Changes' pages. The URL must not have a trailing slash; a valid example is
url http://www.site.com/mysite

If the tempupload option is given, new or changed files are uploaded with a ".in." prefix, then moved to the true filename when the upload is complete.

File State

File state is stored in the storage files (~/.sitecopy/*), and is used to discover when a file has been changed. Two methods are supported, and can be selected using the state option, with either parameter: timesize (the default), and checksum.

timesize uses the last-modification date and the size of files to detect when they have changed. checksum uses an MD5 checksum to detect any changes to the file contents.

Note that MD5 checksumming involves reading in the entire file, and is slower than simply using the last-modification date and size. It may be useful for instance if a versioning system is in use which updates the last-modification date on a 'checkout', but this doesn't actually change the file contents.
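
The 'checkout' case can be illustrated with a small sketch. It does not call sitecopy: it only shows that rewriting a file's modification time leaves its contents, and therefore any checksum of them, unchanged. The POSIX cksum utility (a CRC, not MD5) stands in for the MD5 checksum here, and the file name is an assumption made for illustration.

```shell
#!/bin/sh
# Sketch: a changed timestamp does not change the file contents.
# cksum (a CRC, not MD5) stands in for sitecopy's checksum here.
WORK=$(mktemp -d)
printf '<html>hello</html>\n' > "$WORK/page.html"

before=$(cksum < "$WORK/page.html")
# Simulate a 'checkout' that only rewrites the modification time.
touch -t 200101010000 "$WORK/page.html"
after=$(cksum < "$WORK/page.html")

if [ "$before" = "$after" ]; then
    echo "contents unchanged: a checksum comparison sees no change"
fi
```

A date-and-size comparison would see the new modification time and treat the file as changed, which is the situation the checksum method avoids.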

Safe Mode

Safe Mode is enabled by using the safe key. When enabled, each time a file is uploaded to the server, the modification time of the file as on the server is recorded. Subsequently, whenever this file has been changed locally and is to be uploaded again, the current modification time of the file on the server is retrieved, and compared with the stored value. If these differ, then the remote copy of the file has been altered by a foreign party. A warning message is issued, and your local copy of the file will not be uploaded over it, to prevent losing any changes.

Safe Mode can be used with FTP or WebDAV servers, but if Apache/mod_dav is used, mod_dav 0.9.11 or later is required.

Note: Safe Mode cannot be used in conjunction with the nooverwrite option (see below).

File Storage Locations

The remote key specifies the root directory of the remote copy of the site. It may be in the form of an absolute pathname, e.g.
remote /www/mysite/
For FTP, the directory may also be specified relative to the login directory, in which case it must be prefixed by "~/", for example:
remote ~/public_html/

The local key specifies the directory in which the site is stored locally. This may be given relative to your home directory (as given by the environment variable $HOME), again using the "~/" prefix. For example,
local ~/html/foosite/
local /home/fred/html/foosite/
are equivalent, if $HOME is set to "/home/fred".

For both the local and remote keywords, a trailing slash may be used, but is not required.

File Permissions Handling

File permissions handling is dictated by the permissions key, which may be given one of three values:

ignore to ignore file permissions completely (the default),
exec to mirror the permissions of executable files only,
all to mirror the permissions of all files.

This can be used, for instance, to ensure the permissions of CGI files are set. The option is currently ignored for WebDAV servers. For FTP servers, a chmod is performed remotely to set the permissions.

To handle directory permissions, the key:
permissions dir
may be used in addition to a permissions key of either exec, local or all. Note that permissions all does not imply permissions dir.
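
As an illustration of combining the two keys, the sketch below writes a site definition that mirrors the executable bit of CGI scripts and also handles directory permissions. The site name, paths and scratch file are assumptions made for illustration; the scratch file stands in for ~/.sitecopyrc.

```shell
#!/bin/sh
# Sketch: a site definition using both permissions keys. All names and
# paths are illustrative; RCFILE is a hypothetical scratch file.
RCFILE=$(mktemp)

cat > "$RCFILE" <<'EOF'
site cgisite
server my.server.com
username fred
local /home/fred/cgi/
remote ~/cgi-bin/
# Keep the executable bit on CGI scripts...
permissions exec
# ...and handle directory permissions as well.
permissions dir
EOF

# An rcfile must be accessible only by its owner.
chmod 600 "$RCFILE"
cat "$RCFILE"
```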

Symbolic Link Handling

Symlinks found in the local site can be either ignored, followed, or maintained. In 'follow' mode, the files referenced by the symlinks will be uploaded in their place. In 'maintain' mode, the link will be created remotely as well, see below for more information. The mode used for each site is specified with the symlinks rcfile key, which may take the value of ignore, follow or maintain to select the mode as appropriate.

The default mode is follow, i.e. symbolic links found in the local site are followed.

Symbolic Link Maintain Mode

This mode is currently only supported by the WebDAV driver, and will work only with servers which implement WebDAV Advanced Collections, which is a work-in-progress. The target of the link on the server is literally copied from the target of the symlink. Hint: you can use URLs if you like:
ln -s "http://www.somewhere.org/" somewherehome

In this way, a "302 Redirect" can be easily set up from the client, without having to alter the server configuration.

Deleting and Moving Remote Files

The nodelete option may be used to prevent remote files from ever being deleted. This may be useful if you keep large amounts of data on the remote server which you do not need to store locally as well.

If your server does not allow you to upload changed files over existing files, then you can use the nooverwrite option. When this is used, before uploading a changed file, the remote file will be deleted.

If the checkmoved option is used, sitecopy will look for any files which have been moved locally. If any are found, when the remote site is updated, the files will be moved remotely.

If the checkmoved renames option is used, sitecopy will look for any files which have been moved or renamed locally. This option may only be used in conjunction with the state checksum option.

WARNING

If you are not using MD5 checksumming (i.e. the state checksum option) to determine file state, do NOT use the checkmoved option if you tend to hold files in different directories with identical sizes, modification times and names and ever move them about. This seems unlikely, but don't say you haven't been warned.
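
One way to stay clear of this warning is to pair move detection with checksumming. The sketch below writes such a site definition to a scratch file; the site details are borrowed loosely from the examples in this manual and the scratch file is a hypothetical stand-in for ~/.sitecopyrc.

```shell
#!/bin/sh
# Sketch: move and rename detection backed by checksums.
# RCFILE is a hypothetical scratch file, not a real rcfile.
RCFILE=$(mktemp)

cat > "$RCFILE" <<'EOF'
site mysite
server my.server.com
username fred
local /home/fred/html/
remote ~/public_html/
# Detect changes by file contents rather than by date and size.
state checksum
# Spot files that were moved or renamed locally.
checkmoved renames
EOF

cat "$RCFILE"
```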

Excluding Files

Files may be excluded from the files list by use of the exclude key, which accepts shell-style globbing patterns. For example, use
exclude *.bak
exclude *~
exclude "#*#"
to exclude all files which have a .bak extension, end in a tilde (~) character, or which begin and end with a hash. Don't forget to quote or escape the value if it includes a hash!
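
To get a feel for which names these three patterns catch, they can be tried out with the shell's own pattern matching. This is only an approximation: is_excluded is a hypothetical helper, not part of sitecopy, and sitecopy's matcher may differ in corner cases.

```shell
#!/bin/sh
# Sketch: approximate the three exclude patterns with the shell's own
# pattern matching. is_excluded is a hypothetical helper.
is_excluded() {
    case "$1" in
        *.bak | *~ | '#'*'#') return 0 ;;
        *) return 1 ;;
    esac
}

for name in index.html index.html.bak notes.txt~ '#draft#'; do
    if is_excluded "$name"; then
        echo "$name: excluded"
    else
        echo "$name: kept"
    fi
done
```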

To exclude certain files within a particular directory, simply prefix the pattern with the directory name - including a leading slash. For instance:
exclude /docs/*.m4
exclude /files/*.gz
which will exclude all files with the .m4 extension in the 'docs' subdirectory of the site, and all files with the .gz extension in the 'files' subdirectory.

An entire directory can also be excluded - simply use the directory name with no trailing slash. For example
exclude /foo/bar
exclude /where/else
to exclude the 'foo/bar' and 'where/else' subdirectories of the site.

Exclude patterns are consulted when scanning the local directory, and when scanning the remote site during a --fetch. Any file which matches any exclude pattern is not added to the files list. This means that a file which has already been uploaded by sitecopy, and subsequently matches an exclude pattern, will be deleted from the server.

Ignoring Local Changes to Files

The ignore option is used to instruct sitecopy to ignore any local changes made to a file. If a change is made to the contents of an ignored file, this file will not be uploaded by update mode. Ignored files will be created, moved and deleted as normal.

The ignore option is used in the same way as the exclude option.

Note that synchronize mode will overwrite changes made to ignored files.

FTP Transfer Mode

To specify the FTP transfer mode for files, use the ascii key. Any files which are transferred using ASCII mode have CRLF/LF translation performed appropriately. For example, use
ascii *.pl
to upload all files with the .pl extension as ASCII text. This key has no effect with WebDAV (currently).

RETURN VALUES

Return values are specified for different operation modes. If multiple sites are specified on the command line, the return value is in respect to the last site given.

Update Mode


-1 ... update never even started - configuration problem
0 ... update was entirely successful.
1 ... update went wrong somewhere
2 ... could not connect or log in to server

List Mode (default mode of operation)


-1 ... could not form list - configuration problem
0 ... the remote site does not need updating
1 ... the remote site needs updating
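
In a wrapper script, the return values above can be turned into messages. The following is a minimal sketch: describe_status is a hypothetical helper, not part of sitecopy. It assumes a POSIX shell, where a return value of -1 is seen as exit status 255 because only the low eight bits of the status are kept.

```shell
#!/bin/sh
# Sketch: map a sitecopy exit status to a message for the given mode.
# describe_status is a hypothetical helper. A return value of -1
# appears to the shell as 255.
describe_status() {
    mode=$1
    status=$2
    case "$mode:$status" in
        update:0)   echo "update was entirely successful" ;;
        update:1)   echo "update went wrong somewhere" ;;
        update:2)   echo "could not connect or log in to server" ;;
        update:255) echo "update never started: configuration problem" ;;
        list:0)     echo "remote site does not need updating" ;;
        list:1)     echo "remote site needs updating" ;;
        list:255)   echo "could not form list: configuration problem" ;;
        *)          echo "unrecognized status $status for $mode mode" ;;
    esac
}

# Usage (commented out so the sketch runs without sitecopy installed):
#   sitecopy --update mainsite
#   describe_status update "$?"
describe_status list 1
```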

EXAMPLE RCFILE CONTENTS

FTP Server, Simple Usage

Fred's site is uploaded to the FTP server 'my.server.com' and held in the directory 'public_html', which is in the login directory. The site is stored locally in the directory /home/fred/html.

site mysite
server my.server.com
url http://www.server.com/fred
username fred
password juniper
local /home/fred/html/
remote ~/public_html/

FTP Server, Complex Usage

Here, Freda's site is uploaded to the FTP server 'ftp.elsewhere.com', where it is held in the directory /www/freda/. The local site is stored in /home/freda/sites/elsewhere/

site anothersite
server ftp.elsewhere.com
username freda
password blahblahblah
local /home/freda/sites/elsewhere/
remote /www/freda/
# Freda wants files with a .bak extension or a
# trailing ~ to be ignored:
exclude *.bak
exclude *~

WebDAV Server, Simple Usage

This example shows use of a WebDAV server.

site supersite
server dav.wow.com
protocol webdav
username pow
password zap
local /home/joe/www/super/
remote /

FILES

~/.sitecopyrc Default run control file location.
~/.sitecopy/ Remote site information storage directory.
~/.netrc Remote server accounts information.

BUGS

Known problems: Fetch + synch modes are NOT reliable for FTP. If you need reliable operation of fetch or synch modes, you shouldn't be using sitecopy. Try rsync instead.

Please send bug reports and feature requests to <sitecopy@lyra.org> rather than to the author, since the mailing list is archived and can be a useful resource for others.

SEE ALSO

rsync(1), ftp(1), sftp(1), mirror(1)

STANDARDS

[Listed for reference only, no claim of compliance to any of the below standards is made.]

RFC 959 - File Transfer Protocol (FTP)
RFC 1521 - Multipurpose Internet Mail Extensions Part One
RFC 1945 - Hypertext Transfer Protocol -- HTTP/1.0
RFC 2396 - Uniform Resource Identifiers: Generic Syntax
RFC 2518 - HTTP Extensions for Distributed Authoring -- WEBDAV
RFC 2616 - Hypertext Transfer Protocol -- HTTP/1.1
RFC 2617 - HTTP Authentication
REC-XML - Extensible Markup Language (XML) 1.0
REC-XML-NAMES - Namespaces in XML

DRAFT STANDARDS

draft-ietf-ftpext-mlst-05.txt - Extensions to FTP
draft-ietf-webdav-collections-protocol-03.txt - WebDAV Advanced Collections Protocol

AUTHOR

Joe Orton and others.
e-mail: sitecopy@lyra.org
www: http://www.lyra.org/sitecopy/

June 2001 sitecopy