Last modified on 11/16/11 16:01:25

Earthworm Module: startstop

Function

Starts & stops all EARTHWORM modules on a computer. This module is the core of the earthworm system.

Details

This program starts and stops an Earthworm system. It reads its configuration file, which specifies the message transport rings to be created, the modules to be run, and the names of the parameter files each module is to read on startup. The program is system dependent, and there are versions available for the Linux, Sun Solaris and Windows NT operating systems.

For startstop to work, it must know about the Earthworm environment. This is typically done by setting the environment variables in the environment/ew_* file specific to your platform and then sourcing that file, or executing the cmd file if you're on Windows. Startstop typically reads its configuration file from the EW_PARAMS directory (as defined in your environment) and creates the specified rings. It then starts each module as a child process, passing the module's configuration file name and any other parameters as command line parameters (argc, argv). Each module (child process) is started with the priority indicated in startstop*.d. Note that each module and each ring specified must be defined in earthworm.d or earthworm_global.d, which should be in the EW_PARAMS directory. The system continues to run until "quit<cr>" is typed in startstop's command window. Startstop then sets a terminate flag in each transport ring. Each well-behaved module (child process) should periodically check for the terminate flag, and exit gracefully if it is set.
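The shutdown handshake described above can be modelled in a few lines. This is a plain-Python sketch, not Earthworm code: `Ring`, `run_module` and `do_work` are hypothetical names, and a real module would poll the flag in its shared-memory transport ring rather than an in-process object.

```python
import threading


class Ring:
    """Hypothetical stand-in for a transport ring that carries a terminate flag."""

    def __init__(self, name):
        self.name = name
        self._terminate = threading.Event()

    def set_terminate(self):
        # Models what startstop does to every ring after "quit" is typed.
        self._terminate.set()

    def terminate_requested(self):
        return self._terminate.is_set()


def run_module(ring, do_work, max_cycles=1000):
    """Call do_work() repeatedly, checking the terminate flag before every cycle.

    Returns the number of completed work cycles. max_cycles only bounds the sketch.
    """
    cycles = 0
    while cycles < max_cycles and not ring.terminate_requested():
        do_work()
        cycles += 1
    return cycles
```

The point of the pattern is that the module, not startstop, decides when it is safe to exit: it finishes the current unit of work and only then looks at the flag.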

Note that two copies of startstop pointing at the same startstop*.d file are not allowed to run simultaneously. The second one started will fail and quit. (If you really want to do this for some reason, you'd need to make sure that you use all different rings in the second version, different ports for the modules, and a different startstop*.d file, specified as a parameter when starting startstop.)

If the user presses the "Enter" key while the startstop command window is selected, or enters the command "status", startstop will print a status table showing various statistics for each module, including whether it is dead or alive. If a module is dead because it could not be started (for example, the executable's name was mistyped so the executable could not be found), it will be reported as NoExec.

Startstop will also react to 'restart' messages from statmgr. This is part of a scheme which works as follows: A module may have the token "restartMe" in its .desc file (the file given to statmgr, which tells it how to process exception conditions from that module). If the module's heartbeat ceases, statmgr will send a restart request to startstop. Startstop will then kill the offending module and restart it with the same arguments it used at startup time. There are some system specific features, listed in the platform sections below.
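As a rough illustration of the restart scheme, the decision statmgr makes can be written as a pure function. This is a sketch under assumptions: the function name, the per-module timeout table and the module names used with it are hypothetical, and the real statmgr reads its settings from .desc files.

```python
def modules_to_restart(last_heartbeat, timeout, restart_me, now):
    """Return names of modules whose heartbeat is overdue and that opted in to restarts.

    last_heartbeat: dict of module name -> time of the last heartbeat (seconds)
    timeout:        dict of module name -> allowed silence (seconds), assumed per module
    restart_me:     set of module names whose .desc file contains "restartMe"
    now:            current time (seconds)
    """
    overdue = [
        name for name, seen in last_heartbeat.items()
        if now - seen > timeout[name]
    ]
    # Only modules that carry the "restartMe" token trigger a restart request.
    return sorted(name for name in overdue if name in restart_me)
```

A module that is silent but lacks "restartMe" is left alone in this model; only the restart request to startstop depends on the token.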

Interactive commands

Startstop will respond to the following commands from the status console window. There are similar command line versions of each command as well (e.g., pau is the command line version of quit).

quit

Completely shuts down EARTHWORM and all modules and rings.

Within startstop:

quit
  • Startstop will send all child processes (modules) a request to quit, and will kill them if they don't quit within 30 seconds or so. It will then shut itself down.
  • The command line equivalent to "quit" is called "pau".
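The "ask first, kill after a delay" behaviour of quit can be sketched with Python's subprocess module. This is only an analogy: here the polite request is an operating-system signal sent by `terminate()`, whereas startstop asks modules to quit through the terminate flag in the rings. `stop_child` and `grace_seconds` are hypothetical names.

```python
import subprocess


def stop_child(proc, grace_seconds=30.0):
    """Ask a child process to exit; kill it if it is still alive after the grace period.

    Returns "exited" if the child stopped on request, "killed" if it had to be killed.
    """
    proc.terminate()  # polite request (SIGTERM on Unix)
    try:
        proc.wait(timeout=grace_seconds)
        return "exited"
    except subprocess.TimeoutExpired:
        proc.kill()  # hard kill after the grace period
        proc.wait()  # reap the process so it does not linger as a zombie
        return "killed"
```

On Windows, `terminate()` and `kill()` do the same thing, so the distinction between the two outcomes is only meaningful on Unix-like systems.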

pau

Completely shuts down EARTHWORM and all modules and rings.

Pau is invoked from the command line (provided you have the earthworm environment variables set from the appropriate script in the environment directory) by typing:

pau

Alternatively, pau can take the location of a startstop*.d file as an argument:

pau startstop_nt.d
  • The command line equivalent of "quit".
  • Pau uses startstop's configuration file, startstop*.d, to shut down the entire Earthworm system.
  • The pau command line utility sends an Earthworm TERMINATE message, which should result in the entire Earthworm system shutting down. Pau depends on the earthworm environment variables being set (from the appropriate script in the environment directory), or it depends on your giving the startstop configuration file as the one parameter.

pidpau

Given an Earthworm module process ID, pidpau stops it.

pidpau <pid>
  • The command line equivalent of "stopmodule".
  • The pidpau command line utility requests that Earthworm terminate the particular module associated with the pid (Process ID) you give pidpau as a parameter. The module should stop itself if it's behaving properly.
  • The program attaches to earthworm's shared memory region(s) and sets the flag in the header to the given value (pid). This is intended to signal that pid to terminate gracefully.
  • If a configuration file isn't specified as an optional parameter, pidpau uses startstop's configuration file, startstop*.d, to attach to the shared memory regions.
  • Pidpau needs the process id (pid) of the module it's attempting to stop. You can find out the process id by getting the status from typing Enter in an interactive startstop session, or by running the 'status' command, or by looking at UNIX's process list (with ps) or by looking at Windows Task Manager with display of Process ID activated.

reconfigure

Allows adding new modules or rings to a running Earthworm.

Reconfigure is invoked from within startstop or from the command line (provided you have the earthworm environment variables set from the appropriate script in the environment directory) by typing:

reconfigure

OR

recon
  • The reconfigure program sends an Earthworm TYPE_RECONFIGURE message. Startstop will then re-read startstop_nt.d, startstop_unix.d or startstop_sol.d, allocate any new rings, and start up any new modules it finds in the new .d file. In the process it re-reads earthworm.d and earthworm_global.d, in case new module IDs or new ring IDs have been added there.
  • As the final reconfigure step, statmgr is restarted as well so that it re-reads its config file. Any modules that were added to startstop*.d should be added to the statmgr.d config file as well.
  • Reconfigure is a command you can run from within startstop, and it is also a standalone command-line tool. The standalone tool will show you a 'status' before and after the reconfigure.
  • Note that modules that are already running will not be restarted and should not be affected.
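The "add only" nature of reconfigure can be illustrated as a comparison between what is running and what the re-read file lists. This is a minimal sketch with hypothetical names (`plan_reconfigure` and its arguments); it does nothing with entries that were removed from the file, because this page does not say how startstop treats them.

```python
def plan_reconfigure(running_rings, running_modules, new_rings, new_modules):
    """Return (rings_to_create, modules_to_start) for a reconfigure pass.

    Rings and modules that are already running are left untouched.
    The order given in the new configuration is preserved.
    """
    rings_to_create = [r for r in new_rings if r not in running_rings]
    modules_to_start = [m for m in new_modules if m not in running_modules]
    return rings_to_create, modules_to_start
```

For example, if the running system has one ring and one module and the edited file lists one more of each, only the two additions are returned.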

restart

Allows manual restarting of individual modules.

Restart is invoked with the command:

restart <pid> or restart <module name>
  • The restart program sends an Earthworm TYPE_RESTART message with the specified pid(s) to startstop, for manual restarting of individual modules. The pid can be obtained from the status message printed by status. Startstop will then send the module a message to exit, and may try and kill it if it doesn't quit by itself in a certain period of time. Next startstop will attempt to start the process back up.
  • Note that the <module name> must be unique for this to work as an argument. The command line version can only accept the pid (Process Id) as an argument.
  • The module writes the TYPE_RESTART for each pid provided into the first ring created by startstop. For this reason, the startstop file must be named as per OS naming conventions, or passed in as the second argument after a -c flag is given.
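The uniqueness requirement on <module name> can be made concrete with a small lookup sketch. The function and the shape of the status table are assumptions for illustration; startstop's own lookup is not shown on this page.

```python
def resolve_pid(target, status_table):
    """Map a restart argument (pid or module name) to a single pid.

    target:       a pid given as digits, or a module name
    status_table: list of (pid, module_name) pairs, as one might read off "status"
    Raises ValueError if the pid is unknown or the name matches zero or several modules.
    """
    if target.isdigit():
        pid = int(target)
        if all(pid != p for p, _ in status_table):
            raise ValueError(f"no module with pid {pid}")
        return pid
    matches = [p for p, name in status_table if name == target]
    if len(matches) != 1:
        raise ValueError(f"{target!r} matches {len(matches)} modules; use a pid")
    return matches[0]
```

When the same module is listed twice (for example, two instances with different configuration files), the name is ambiguous and only the pid identifies the instance.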

StartstopConsole

Creates a command prompt window with access to startstop_service (Windows only).

  • The StartstopConsole connects to a running instance of Windows Startstop via named pipes and creates a Windows command prompt window. It's useful when running Startstop Service, and you need to apply command line tools like "restart", "reconfigure", "status", "pidpau" or "pau".
  • Configuration Hint: In order to get this to work on XP you need to disable the "XP Welcome screen" and disable "Fast user switching".

status

Displays information about the status of Earthworm, including a listing of the rings and of modules.

Within startstop:

status
  • Within startstop, status can be invoked by hitting the "Enter" key

stopmodule

Given an Earthworm module process ID, stopmodule stops it, and startstop marks it as "Stop" to prevent statmgr from restarting it.

From the command line or within startstop:

stopmodule <pid> or stopmodule <module name>

Also within startstop:

stop <pid> or stop <module name>

Use 'status' to find the PID for your module if you don't know it.

  • Stopmodule attaches to Earthworm's first shared memory ring (the first one listed in startstop*.d) and sends a TYPE_STOP message with a payload of the PID. This is intended to signal that process ID to terminate gracefully.
  • Startstop sees the TYPE_STOP message and will 1) request the module shut itself down gracefully; that failing it will 2) hard kill the module. Startstop will mark this module as "Stop" after it has confirmed the process is dead. Startstop will not try to start the process back up, and statmgr shouldn't try to restart it either.
  • Statmgr sees the TYPE_STOP message and sets its internal restart status for that module to STOPPED. Until statmgr sees a TYPE_RESTART message for a stopped module, it should not request a restart of the module.
  • Stopmodule needs the process id (PID) of the module it's attempting to stop. You can find out the process id by getting the status from typing Enter in an interactive startstop session, or by running the 'status' command, or by looking at UNIX's process list (with ps) or by looking at Windows Task Manager with display of Process ID activated.
  • Note that the <module name> must be unique for this to work as an argument. The command line version can only accept the pid as an argument.
  • If startstop is quit, and restarted (Earthworm shut down completely and then started), and you didn't manually remove the module you stopped from the startstop*.d, the previously stopped module WILL start up in the new session.
  • The command-line "stopmodule" should mark the module as intentionally stopped, showing up as "Stop" in the status listing. This differs from the command line tool "pidpau", which will simply kill a module. A module killed by "pidpau" won't be marked as "Stop", so if statmgr is set to monitor and restart that module, it will get started back up again. A module stopped by "stopmodule" should not.
  • The module is stopped only for the duration that this startstop session is running! If you want to permanently stop a module, you'll also want to remove it from the startstop*.d and statmgr.d files so it doesn't get started up next time around.
  • Warning: On Solaris there's a shell command called "stop". If you accidentally type "stop <pid>" instead of "stopmodule <pid>", Solaris won't be able to stop or restart the module in question. The solution: in a unix command prompt, "kill <pid>", then "restart <pid>", then finally "stopmodule <newpid>"
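The difference between pidpau and stopmodule comes down to one piece of bookkeeping, which the following toy model tries to capture. Everything here is hypothetical: the class, its method names and the "Dead" label are illustrative, and only "Alive" and "Stop" are status strings taken from this page.

```python
class ModuleState:
    """Hypothetical bookkeeping for one module, as seen by startstop and statmgr."""

    def __init__(self, name, restart_me):
        self.name = name
        self.restart_me = restart_me  # "restartMe" present in the module's .desc file
        self.status = "Alive"

    def pidpau(self):
        # The process is terminated, but nothing records that this was intentional.
        self.status = "Dead"

    def stopmodule(self):
        # The process is terminated and flagged as intentionally stopped.
        self.status = "Stop"

    def restart(self):
        # A TYPE_RESTART clears the stopped state and brings the module back.
        self.status = "Alive"

    def statmgr_would_restart(self):
        # statmgr only revives modules that died unexpectedly and opted in.
        return self.status == "Dead" and self.restart_me
```

In this model a module with "restartMe" comes back after pidpau but stays down after stopmodule, until someone issues an explicit restart.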

Solaris, Linux, Mac OS X versions

  • Solaris startstop reads a configuration file named startstop_sol.d.
  • Mac OS X and Linux startstop read a configuration file named startstop_unix.d.
  • If a child process does not exit within a user specified time after the user types "quit<cr>" (or "stopmodule" or "restart"), startstop terminates the child process. Startstop will resort to a more draconian but reliable approach to quitting a module if the standard approach fails, but only if a command to do so is included in the configuration file.
  • The amount of CPU time used by each child process is listed in the process status table.
  • As of Version 3.0, Startstop can run in background. This modification was made by Pete Lombard at the University of Washington (see below: 'Instructions for Running Startstop in Background').
  • To run Earthworm as a user other than root, you must set the file characteristics (see below: 'Startstop File Characteristics').
  • For Mac OS X you must adjust the shared memory settings using the /etc/sysctl.conf file and rebooting. We recommend values like this:
kern.sysv.shmmax=16777216
kern.sysv.shmmin=1
kern.sysv.shmmni=32
kern.sysv.shmseg=16
kern.sysv.shmall=4096
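One way to sanity-check these values is to compare shmall against shmmax. The sketch below assumes that shmall is counted in pages and that a page is 4096 bytes; both are assumptions that should be checked against your system, and the function names are hypothetical.

```python
def parse_sysctl(text):
    """Parse 'key=value' lines into a dict of ints, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = int(value)
    return settings


def shmall_covers_shmmax(settings, page_size=4096):
    """True if shmall (assumed to be in pages) can hold one segment of shmmax bytes."""
    return settings["kern.sysv.shmall"] * page_size >= settings["kern.sysv.shmmax"]
```

Under those assumptions the recommended values are consistent: 4096 pages of 4096 bytes is 16777216 bytes, which equals shmmax.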

Windows Service version

  • Windows startstop and Windows startstop service read a configuration file named startstop_nt.d.
  • If Windows starts up, and, for example, the binary executables for certain modules are missing or are misnamed, startstop will start up anyway. These processes will be shown with a nonexistent negative process ID, and "NoExec" as their status. If this process is restarted once the problem that caused the error has been fixed, the process ID will return to a normal ID, and the status will change to "Alive".
  • Startstop can be set to start automatically when Windows boots up (see below: 'Earthworm NT Autostart'), but probably better than doing that is to set startstop as a Windows service (see below: Earthworm Windows Service). Note if you set Startstop as a Windows service you'll need to use other command line utilities like 'status' and 'restart' to monitor and control earthworm modules since there's no interface to the Startstop service. You can run StartstopConsole in order to be able to connect to the session running earthworm, if you're not logged in as administrator (see above: StartstopConsole). You'll be able to start and stop Earthworm with the Windows Services Control Panel.

Instructions for Running Startstop in Background

To run startstop in background from the run/params directory with shells like csh, use a command like:

startstop >>& ../log/startstop_log &

This will collect the standard output and standard error from startstop AND all the modules controlled by startstop into the log file. If you don't want to save this output, you can replace "../log/startstop_log" with "/dev/null".
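For sites that launch startstop from a script instead of an interactive csh, the same redirection can be approximated in Python. This is a sketch, not part of Earthworm: `start_in_background` is a hypothetical helper, and the command and log path you pass in are up to you.

```python
import subprocess


def start_in_background(command, log_path):
    """Start command without waiting, appending its stdout and stderr to log_path.

    Mirrors the csh form `command >>& log_path &`. Returns the Popen handle.
    """
    log = open(log_path, "ab")  # append, like >>
    try:
        # stderr is merged into stdout, like >& in csh
        return subprocess.Popen(command, stdout=log, stderr=subprocess.STDOUT)
    finally:
        log.close()  # the child keeps its own copy of the file descriptor
```

Because child processes inherit the redirected descriptors, output from anything the launched program starts also ends up in the same log file.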

To get the status of Earthworm while startstop is running in background, use the Earthworm program "status". To stop a running Earthworm, use "pau". Both "status" and "pau" are used without any arguments.

NOTE: "status" and "pau" need the environment of the running Earthworm in order to communicate with it. If you change that environment, including the startstop_sol.d file, in anticipation of stopping and restarting Earthworm, then "pau" may not be able to tell the running Earthworm to shut down. If you specify an alternate command file when you run startstop, then you should also specify this command file for "status" and "pau".

Also, you should wait a few seconds after starting Earthworm before running "status", to let the Earthworm modules get going.

Startstop File Characteristics

For a more secure operation Earthworm should be run by a user other than root; however, some modules need to have root permissions to set their priority. To set it up to do this you should:

Login as 'root'.

cd /home/earthworm/vX.X/bin

chown root startstop

chmod 4775 startstop

You should now be able to run earthworm without being root.
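To see what mode 4775 means, the octal value can be decoded with Python's standard stat module. The helper name `describe_mode` is hypothetical; the leading 4 is the set-user-ID bit, which is what lets startstop run with root's privileges when started by another user.

```python
import stat


def describe_mode(octal_mode):
    """Return (symbolic permissions, is_setuid) for a regular file's octal mode."""
    symbolic = stat.filemode(stat.S_IFREG | octal_mode)
    return symbolic, bool(octal_mode & stat.S_ISUID)
```

`describe_mode(0o4775)` returns `('-rwsrwxr-x', True)`: the `s` in the owner field marks the set-user-ID bit, and since the file is owned by root, the program runs as root.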

Earthworm NT Autostart

Following are steps needed to have earthworm (EW) start up automatically when a PC running Windows NT Workstation 4.0 boots up.

Overview of the steps:

  1. Create new user 'earthworm'
  2. Set up the PC so that it automatically logs in as 'earthworm' upon reboot
  3. Set up EW so that it starts up every time user 'earthworm' logs in

Note: Selecting menu choices is denoted as follows:

Start->Programs->...

This would mean that the Start menu is to be selected first, then Programs, etc.

Details:

  1. Create new user earthworm

a. Click on Start->Programs->Administrative Tools->User Manager. The User Manager window should come up.

b. Click on User->New User. The New User window should come up.

Fill in the fields:

Username: earthworm
Password: earthworm

Make sure that the box labeled "User Must Change Password.." is NOT checked.

c. Set up groups: click on the "Groups" icon. The Groups window should come up.

Select all groups in the right hand screen, then click Add, then click OK. Back in the New User window, click OK. Back in the User Manager, click User->Exit.

  2. Automatic login

a. Start the registry editor: click on Start->Run, type "regedt32" in the box, then click OK.

b. The registry editor window should open. Click the window titled "HKEY_LOCAL_MACHINE".

c. The HKEY_LOCAL_MACHINE window should come to the foreground. Double click on the following:

Software->Microsoft->Windows NT->CurrentVersion->WinLogon

d. Now we need to create and/or set some values in the registry in the right hand section of the window.

  • AutoAdminLogon

If AutoAdminLogon exists: Double click on the AutoAdminLogon field. Enter 1 in the pop-up window.

If AutoAdminLogon does not exist: From the Edit pull-down menu, select Add Value. In the pop-up window, enter "AutoAdminLogon" in the Value Name box. Choose 'REG_SZ' for the data type. Click OK. Enter 1 in the next pop-up window, then click OK.

  • DefaultUserName

Click on Edit->Add Value. A dialog box should come up. Fill in Value Name: DefaultUserName, then click OK. Another window comes up. Fill in String: earthworm, then click OK.

  • DefaultPassword

Follow the step above to add another value:

Value Name: DefaultPassword
String: earthworm

When all is said and done, the following values should be listed in the right hand window:

DefaultUserName earthworm
DefaultPassword earthworm
AutoAdminLogon 1

e. Click on Registry->Exit.
  3. Automatic Earthworm Startup during login

a. Restart the PC. If everything in steps 1 and 2 was done correctly, the PC will automatically log you in as earthworm.

b. Make sure that the EW automatic startup script exists. It should be called something like c:\earthworm\run\params\autostart_ew_nt.cmd.

Example autostart_ew_nt.cmd:

call ew_nt.cmd
call startstop.exe

c. Click on Start->Settings->Taskbar. The Taskbar window should come up. Click on the "Start Menu Programs" tab. Click on "Advanced". NT Explorer should come up.

d. By double clicking on the appropriate directories, find:

c:\winnt\Profiles\earthworm\StartMenu->Programs->Startup

The right hand side window under Startup should be empty.

e. Create a shortcut to the EW automatic startup script: In the right hand side window, click the right button, then select New->Shortcut.

A new window should come up. In the box, enter the full path to the autostart_ew_nt script, click NEXT, and then click FINISH. Alternatively, click on Browse to select the path to the script using NT Explorer, highlight autostart_ew_nt.cmd, click OPEN, then NEXT, and then FINISH. Finally, click OK in the Taskbar window.

NOTES OF CAUTION:

  • User earthworm must have a password in order for automatic login to work. This means that the password cannot be blank.
  • When creating user earthworm, make sure that it is added to all groups, particularly the administrator group, since earthworm will be the only user that can log in once the registry is changed.

CHANGING EARTHWORM VERSION AND CONFIGURATION DIRECTORY

Whenever a new version of Earthworm is installed, or a new configuration directory is desired, the autostart script must be updated. The file autostart_ew_nt.cmd mentioned above calls the main command file ew_nt.cmd (which is located in the params directory of the current configuration) and then runs the startstop command, which starts Earthworm automatically. Therefore, whenever ANY change is made to the location of the active ew_nt.cmd, the autostart_ew_nt.cmd file should be updated.

Helpful Hints