wiki:v6.3
Last modified 9 years ago Last modified on 01/17/12 12:56:45
Earthworm Release Notes V6.3

(Sept 26, 2005)


Release Notes:  Earthworm Version "working"
Cleaned out after release of v6.2 on 4/15/03

Modified Exlusively by Paul Friberg  (PAF) after 4/15/2004
to build interim version 6.3 release. Between 4/15/2003
and 4/15/2004 the notes were updated as part of regular
bug fixes and mods so any notes during that time are
attributable to the original author. After 4/15/2004
any notes are from PAF with respect to incorporating
new code and bug fixes between v6.2 and v6.3.

IMPORTANT NOTE ABOUT SCN:
*************************
This will be the LAST SCN version of earthworm and
the v7.0 will contain Location Codes for all programs.

NEW MODULES:
***********

startstop_service:
(added to CVS by Paul Friberg 6/29/2005, authored by Mark Morrison)
From the README.txt found in the src/system_control/startstop_sevice:
startstop_service is identical to the old startstop_nt, except that
it runs as a Windows service.  This means that the parameters are all the
same - read from startstop_nt.d - but that startstop isn't just executed
from a command window or via the scheduler.  Note that this version is
taken from startstop_nt, so I haven't compiled or tested any of this under
Solaris or other systems.  I believe it's generally felt, however,
that these modifications don't apply to Solaris.

q3302ew:
(added to CVS by Paul Friberg, authored by Hal Schehner of ISTI.com).
This is a EW module which pulls data from a Quanterra 330
digitizer and outputs trace_buf packets onto a ring, much like 	q2ew
does for older quanterras.

srparxchewsend - Symmetric Research Digitizer module in
data_sources for use with their PC-based A/D boards. For more info
see them at www.symres.com. This compiles under Linux and Windows,
but not yet under Solaris.

slink2ew - Chad Trabant contributed this application to pull data from
seedlink into Earthworm. The version provided with EW v6.3 is 
version 1.2.2 last modified by Chad on 2005.07.14
Here's the slink2ew note from the v7.0 release of EW for future reference:
	A seedlink client for acquiring data from seedlink servers (e.g. IRIS)
	and converting to tracebuf2 (or optionally tracebuf). Much more robust
	than liss2ew. Written by Chad Trabant of IRIS; modified for tracebuf2
	and debugged by Chad Trabant and Jim Luetgert.
	JHL 4/19/05

startstop_unix:
(created by Alexandre Nercessian), imported into Earthworm by PAF.
This version compiles and runs under Linux. It does not have correct
reporting of run time for a module, but otherwise works. It uses and
looks for a startstop_unix.d file (not a startstop_sol.d file as in the
previous version with ew6.2-lnx). PAF 7/21/2005

grf2ew - Generic Recording Format to earthworm. This is for DAQ systems
digitizers. This compiles under Linux and should under Windows too. I
only installed this in the release and did not test it. PAF 9/10/2005

MODIFICATIONS/BUG FIXES TO EXISTING MODULES:

startstop_nt:
Had a restart option added in by Murray McGowan of GSL (9/6/2005). This allows
a user to restart any module from startstop by typing "restart nnn" followed
by the module name or the process id.

startstop_unix/startstop_solaris:
Added in the same restart command as above. It's a nice feature to have. (9/7/2005)

libsrc/solaris/transport.c:
Added: #include 
Function tport_bufthr() now returns NULL.  WMK 3/22/04

libsrc/winnt/transport.c:
Added: #include 

libsrc/util/kom.c
Changed: "static struct k_buf" to "struct k_buf"   WMK 3/22/04

libsrc/solaris/sleep_ew.c:
Added: #include    WMK 3/22/04

gaplist:
Added a "Total Gap" column to the output tables.  "Total Gap" is the
total, in seconds, of the length of all gaps for a particular channel.
"Total Gap" gets reset to 0 when the day rolls over.  Also, added a
column named "Dead Time", which is the elapsed time since a channel
died.  If "Dead Time" is blank, the channel is currently alive.
"Dead Time" does NOT reset to 0 when the day rolls over. WMK 1/8/04

gaplist: Slight table reformat.  WMK 1/20/04

********************************************
libsrc/winnt/sendmail.c
Increased BUFFSIZE from 200 to 250, to accomodate more email recipients.
This is a kludgy fix.  It might better to allocate the buffer on the fly,
to accommodate any number of email messages.  Also, note that I did not
increase BUFFSIZE in libsrc/solaris/sendmail.c  WMK 6/11/03

evtK2ora:
Fixed endTime bug in makeEwSnippet(). Thanks Jim L.
Alex 5/9/03

shakemapfeed:
The last two arguments to ewdb_api_GetSMDataWithChannelInfo were
reversed. That meant that the number of returned SM messages would be
incorrect IF that number exceeded MaxDataPerEvent. The result would be an
out-of-bounds memory access and corrupted SM data. PNL 6/9/03

shakemapfeed:
Changed the sorting used in Filter_SM_data. Now SM messages are sorted by SCNL
instead of by idchan. That means that the output XML will have components of
one station listed consecutively, a big help for shakemap. PNL 10/30/03

shakemapfeed:
Changed the length of the long station name in the mapping file from 
20 to 50 characters.  LDD 2/5/04

skakemapfeed:  (EWDB API - Strong Motion) ewdb_internal_GetAllSMMessages()
(Note: This function is called by ewdb_api_GetSMDataWithChannelInfo())
Fixed a bug in PostGetAllSMMessages().  The idEvent field may be null 
for some records, and Oracle  does not appear to be writing a 0 value to 
the memory area for the null field, so that record assumes the 
idEvent of the previous record located at that point
in the internal buffer.  The result is data corruption, such that you get
messages that are unassociated with an Event, but they appear to be associated
with an event.
DK 02/12/04

shakemapfeed: Added new optional command, SMQueryMethod, to control how SM data
is requested from the DBMS. Default behavior is original 2-stage query (get
all SM data associated with this eventid, then get all UNassociated SM data
in a space/time box). If SMQueryMethod is non-zero, shakemapfeed will request
all SM data in a space/time box. This will return SM data associated with this
event, SM data associated with other events, and UNassociated SM data. 
Regardless of query method, post-query filtering will be performed on eventid,
external eventid, and external author.  SM data associated with a different
eventid but with the same author will be rejected.  Data with a different
author will be allowed - this is most likely data imported from another
network and its eventids are meaningless in our DBMS.  LDD 02/13/2004

shakemapfeed: Changed the auto-feed scheduling logic to allow for variable
time intervals between feeds to shakemap.  Feeds are scheduled at
fixed times after the arrival of the archive message by using the new
command, ScheduleFeed (one time per ScheduleFeed command, up to 20 allowed). 
Auto-feeds will continue at the last ScheduleFeed interval until UpdateDuration 
minutes after the event origin time.  ScheduleFeed replaces the original 
DelayFirstFeed and UpdateInterval commands. These original commands are still 
accepted and will set up the auto-feed schedule in the original manner in the 
absence of ScheduleFeed commands.  LDD 3/4/2004

k2ew:
The channel bitmap bug reported below has been fixed. K2ew now reads the
correct bitmap to determine which channels are streaming. PNL 6/9/03

liss2ew, ew2liss, dumpseed:
There was a bug on Microsoft based systems that caused the user defined
types BYTE, WORD, and LONG to be ignored.  This was a problem specifically
with the type WORD, which was changed from a signed 16-bit value to an
unsigned 16-bit value, causing many problems, most noteably with the 
signed seed varibles for sample rate and sample rate multiplier.  All 
liss based modules have had this problem corrected. JMP 06/17/2003

getfile_ew
There was a feature in getfile_ew that could cause more cpu to be
used than necessary by checking time since last heartbeat more
often than necessary.  Added sleep_ew(1000) within heartbeat
loop. MMW 06/29/2003

getfileII
Ported getfileII to FreeBSD.  Two new files are in the source directory:
makefile.bsd and socket_bsd.c.  No other changes to files. WMK 11/25/03

sudsputaway:
Added functions to properly initialize SUDS structures to 
Banfill's default values. CJB 10/3/2003

sudshead.h:
Updated to make more compatible with suds.h in current
release of PCSUDS. CJB 10/3/2003

file2ew: 
Fixed bug in raw2ew.c which caused file2ew to exit when the 
SuffixType command contained only 2 arguments.  The third 
argument (installation ID) was supposed to be optional, and 
now is!  LDD 10/3/2003

ew2file:
Fixed bug in timing of heartbeat file generation.
Had been using HeartBeatInt instead of HeartFileInt to control
the production loop.  Caused continuous heartbeat file generation
when HeartFileInt==0.  LDD 11/24/2003

sniffwave:
Changed to sniff headers of both TYPE_TRACEBUF and TYPE_TRACE_COMP_UA msgs.
Will only print data values for TYPE_TRACEBUF messages.  Changed to accept
the any of these strings as wildcards in SCN: "wild", "WILD", "*".
LDD 10/31/2003

vX.X/src/data_exchange/Makefile:
Added directives for STANDALONE_MODULES (right now getfileII and sendfileII)
so that their executables, which are created in their source directories, 
are also copied to the vX.X/bin directory.  LDD 11/27/2003
 
libsrc/util/rw_strongmotionII.c:
Changed rd_strongmotionII() to interpret both "-" and "?" as NULL strings
when reading the QID: line qid and qauthor fields.  Previously only
accepted "?" as a NULL string.  Change was required by CISN data which 
uses "-" as a NULL string in those fields.  LDD 2/13/2004

sendfileII:
Ported to BSD Unix.  Added a new makefile, named makefile.bsd.
To compile, type "make -f makefile.bsd".  Otherwise, everything is the
same as on Solaris.  WMK 2/17/04.

EWDB - Station Information:
Modified the two stored procedures that handle the processing of SCNL
from station list tools (stalist_XXX2ora) and automatic data (trace feed,
automatic pick feed, strong motion, etc.).
Modified the procedures to properly process the Location Code of the
SCNL.  Prior to this, Location codes weren't properly being handled.
Because this change is to SQL procedures and not C code, you must 
refresh the DB stored procedures by updating with the SQL scripts from
"working".  (run ewdb_load_external_utils.sql from schema/sql_scripts).
DK 02/17/04


CHANGES TO CONFIGURATION FILES and DESCRIPTOR FILES:
****************************************************

shakemapfeed.d: Added new optional command, SMQueryMethod, to control how SM
data is requested from the DBMS. Default behavior (SMQueryMethod 0) is original 
2-stage query (get all SM data associated with this eventid, then get all 
UNassociated SM data in a space/time box). If SMQueryMethod is non-zero, 
shakemapfeed will request all SM data in a space/time box. This will return 
SM data associated with this event, SM data associated with other events, 
and UNassociated SM data.
  Regardless of query method, post-query filtering will be performed on eventid,
external eventid, and external author.  SM data associated with a different
eventid but with the same author will be rejected.  Data with a different
author will be allowed - this is most likely data imported from another
network and its eventids are meaningless in our DBMS.  LDD 02/13/2004

shakemapfeed.d: New command "ScheduleFeed "
sets the shakemap feeding schedule. Up to 20 ScheduleFeed commands are
allowed, setting one time per command.  Auto-feeds will continue at the
last ScheduleFeed interval until UpdateDuration minutes after the event
origin time. ScheduleFeed replaces the original DelayFirstFeed and UpdateInterval 
commands. These original commands are still accepted and will set up 
the auto-feed schedule in the original manner in the absence of ScheduleFeed 
commands.  LDD 3/4/2004



KNOWN BUGS:
***********

liss2ew:
liss2ew has been observed intermittantly producing malformed    
TRACE_BUF messages.  Currently the conditions for causing this problem   
are unknown.  Due to this, liss2ew should be treated as suspect.  Use at 
your own risk.  JMP 6-18-2003                                            

k2ew: 
k2ew uses the k2hdr.rwParms.misc.channel_bitmap parameter to decide
which channels it will see as streams from the K2.  This is actually the
parameter which the K2 uses to decide which channels to record in an event
file (see K2's  command).  The K2's  is the parameter
that shows which channels the K2 is streaming out.  This is the parameter
that k2ew should be using to decide what data it will see.  The big problem
is that none of the header files from Kinemetrics (nkwhdrs.h) seem to 
show this parameter anywhere. This issue is only a problem for those folks
who want to record more channels on the K2 than they want to stream continuously.
Terry Dye (Univ of Utah) discovered this problem when trying to stream only
3 channels of a 12 channel K2.  LDD 4/11/2002    FIXED PNL, 6/9/03


wave_serverV:
it occasionally sends the following error to statmgr:
   UTC_Thu Sep 06 03:30:14 2001  WSV1/wave_serverV_nano ReadBlockData
   failed for tank [z:\nano57.tnk], with offset[999908] and record
   size[64]! errno[0] Mail sent.
The nominal tank size is 1 megabyte, and the actual tank size is 999908.
It looks like waveserver is trying to read off the end of the tank.
WMK 9/6/01

wave_serverV:
appears not to reply to requests for a single sample of data. I noticed 
when testing wave_viewer, that if the start time and end time of a request were 
equal (in which case there should be one sample of data) then wave_serverV did 
not reply to the request (ASCII request) at all. No Data, No Flags, No Reply, 
No Nothing.  It needs to issue a reply to every request.
DavidK 09/25/01

Automatic restarts of adsend (using the "restartMe" line in the descriptor
file) can cause an NT system to hang. Therefore, you should never
use the autorestart feature with adsend, but you should bring down
the entire Earthworm system if adsend needs to be restarted.
LDD 5/31/2000 Comments added to adsend.desc, but leave this warning here!

libsrc/utils/site.c: The strings used for station, channel and network are
required to be fixed length with trailing spaces added to short names. If
the strings given to site_index do not have these trailing blanks, SCN names
will not match. This is not documented anywhere.  PNL 10/15/00

socket_ew: (libsrc/util/socket_ew_common.c libsrc/solaris/socket_ew.c
            libsrc/winnt/socket_ew.c include/socket_ew.h)
Fixed a bug in the connect_ew function().
When run in non-blocking mode (clients connecting to
a server - using a timeout value), there was a bug in
the connection code(discovered on Solaris) that caused
the function to return a timeout-error when there was
any kind of error during connection.
The bug was discovered when connecting to a non-existent socket.
When trying to connect to a non-existent wave_server on a machine,
the underlying socket library was returning a Connection-Refused error,
but the socket_ew library was passing back a timeout error.
-
The change only applies when a socket-error occurs while the function is
waiting for the connect to happen.  The instance where you will most-likely
see a difference, is when you try connecting to a non-existent socket.
Previously the function would return TIMEOUT, now it will return
connection REFUSED.
-
Added a new function: socketSetError_ew(),
and defined a new socket_ew return code: CONNREFUSED_EW.
DK 2003/02/04

sm_ew2ora:
There is a bug in sm_ew2ora that involves having multipe time intervals for
components and channels.  If a strong motion message containing information
for a channel that the DB has never seen before is loaded into the DB, and
then later another message for the same channel with an earlier timestamp
is loaded, the load of the second message will fail, due to problems with
overlapping time intervals, and the call that sm_ew2ora uses to create
those time intervals.  This problem only affects stations that were
not previously loaded via one of the station loader programs stalist*2ora,
and only when receiving SM data that is timestamped with a time that is
prior to the original time for that channel.  The bug lies in the logic
of ewdb_api_PutSMMessage(), and not in the underlying code.
Davidk 05/25/01

A change was made to ewdb_api_PutSMMessage() that dramatically affects
sm_ew2ora.  Please see the note about that function.  Davidk 2001/07/26 

ewdb_api_CreateWaveform()  (ewdb_internal_CreateSnippet.c 
                            ewdb_internal_CreateWaveformDesc.c)
Added a call to release the cursors used by the internal functions, when they
fail.  Fixes a bug which resulted in a DB cursor leak when a call to stuff
a snippet into the DB failed.
DK 2003/02/04

logit.c:
1) Added a new function logit_core()
int logit_core( char *flag, char *format, va_list ap);
This function is the same as logit(), except that it takes a va_list argument
for the variable length parameter instead of '...', and it has an int 
return value.
This function can be called by other functions that receive a '...' variable 
argument list, where as logit() cannot.  logit_core() is to logit() 
what vsprintf() is to sprintf().
-
2) Moved all of the functionality in logit() to logit_core(), and modify 
logit to call logit_core().
-
3)  Added a new function get_prog_name2(), as an intended replacement 
of get_prog_name().  get_prog_name2() includes an additional
parameter (the buffer length of the output buffer).
-
4)  Modified logit_init() to use get_prog_name2() instead of get_prog_name()
-
All four(4) of these changes should be backwards compatible with all existing
earthworm code, and the modified logit.c has already been tested with several
earthworm modules.
-


ewdb_api:
Added new function ewdb_api_GetEventSummaryInfo() to retrieve summary
information for an event(including the preferred szSource and szSourceEventID
of the event).   011904 DK


KNOWS DEFICIENCIES:
*******************

ringtocoax:
In Windows NT, the time resolution of sleep_ew() is about 16 msec (one clock
tick).  On Solaris, the resolution is about 10 msec.  This is a problem for 
ringtocoax, since packet delays need to be set to a few milliseconds.

statmgr: A space is needed between "tsec:" and the value. 
If it isn't there, things fail. Artifcat of the kom routines. (Alex)

threads functions: The KillThread function on WindowsNT and Solaris
terminate the thread without ensuring that no mutexes are held. If a thread
holds a mutex when it dies, no other thread can get that mutex. PNL 1/12/2000

carlsubtrig:
The system time must be set to GMT and ew_nt.cmd must have 
TZ=GMT for carlsubtrig to work.  Comments in ew_nt.cmd done 5/25/00. Barbara
	
localmag:
needs GMT set on the system

ew2seisvole:
on NT, exits with horrible crash when system is stopped.

NUMBER OF RINGS LIMITED ON SOLARIS:
Under Solaris 2.6 (and probably other versions as well), the maximum number 
of shared memory segments is six. This means that on an out-of-the-box machine
you can only configure six rings. If you try to configure more than that, you
will see a cryptic message from tport_create about too many open files.  The
fix to this problem is to add the following lines to the /etc/system
file, and then reboot the system.

 set shmsys:shminfo_shmmax = 4294967295
 set shmsys:shminfo_shmmin = 1
 set shmsys:shminfo_shmmni = 100
 set shmsys:shminfo_shmseg = 20
 set semsys:seminfo_semmns = 200
 set semsys:seminfo_semmni = 70

This allows for 20 rings.

     Lucky Vidmar (7/6/2000)

startstop_solaris:
Fixed bug in call to logit_init().
The program was printing error messages:
  Invalid arguments passed in.
  Call to get_prog_name failed.
  WARNING: Call logit_init before logit.

startstop_solaris: 
There MAY be a problem with the signal that 
startstop sends to modules during the shutdown sequence. The shutdown 
sequence is started (after typing "quit" to startstop or running "pau")
by startstop placing a terminate message on all transport rings. Modules
should see this message and start their own shutdown. After a configurable
delay, startstop checks to see that all modules have exitted. Any that are 
still running are sent a signal to terminate them. Currently that signal
is SIG_TERM. But since wave_serverV has a handler for SIG_TERM, wave_serverV
sees that as essentially the same as a terminate message. So if wave_serverV
is having problems completing its shutdown, SIG_TERM won't do anything. The
result is that startstop may give up and exit, leaving wave_serverV running.
If that happens, the operator will have to terminate wave_serverV by doing
"kill -9 ". That may leave shared memory and semaphores
stranded in the kernel: run the command "ipcs -a" to see. If necessary,
the stranded shared memory and semaphores may be cleaned up with the
ipcrc command; must be run as root; see the man page.
This problem only exists on Solaris/Unix, not on WindowsNT.
PNL, 10/4/2000

libsrc/utils/kom.c:  fix comment in k_open()

The comment above k_open() says that only one file can
be open at a time. Yet the Kbuf array has slots for MAXBUF (currently 4) open
files. Does this work, or is the comment to be taken at it's word?
PNL 10/15/00

libsrc/utils/logit.c: logit_init() requires a module_id number, which it uses
to construct the log file name. This is not helpful, since the module_id
number is not meaningful to people. Worse, it requires that the config file be
read and earthworm.d lookups be completed before logit calls can be made. Thus
errors in the config file can only be reported to stderr or stdout instead of
being saved in a file.  PNL 11/29/00

libsrc/util/k2evt2ew.c: This library supports a maximum EVT data size
of 800000 samples per channel per EVT file.  This value is hardcoded
as MAXTRACELTH in include/k2evt2ew.h.  k2evt2ew() will return a warning
if the EVT file exceeds this size, and process as much as the EVT file
as the hardcoded limit allows.
DK 2003/01/18


TRACEBUF messages.
The definition of `endtime' of the TRACEBUF message is not documented.
Some programmers are taking it as the "expected start time of the next
TRACEBUF packet (if the sample interval is uniform.)" The more accepted
practice is that `endtime' is the time of the last sample of the current
TRACEBUF packet; that is, one sample interval less than the expected
start time of the next TRACEBUF messsage. Using this last definition, if a
TRACEBUF packet has exactly one sample, then its starttime and endtime are
the identical. Clearly this distinction needs to be documented. The file
waveform_format (in the /home/earthworm/DOC directory) gives no specifics 
about start or end times.  PNL 1/24/01