wiki:binder_ew

Version 7 (modified by branden, 9 years ago) (diff)

--

Earthworm Module: binder_ew

Contributed by: Lynn Dietz

Function

Associates P-arrivals into events

Details

Given a list of P-wave arrival times, a set of station locations and a simple velocity model, binder's job is to identify the smallest set of hypocenters that could have produced the current list of arrival times. With each new pick, binder tries to:

1) associate that arrival to an active event, or

2) identify a new event by stacking the back-projections of P-wave travel times on a space-time grid to find a likely source (hypocenter) for recent unassociated picks.

Sounds simple, right? Well, there are a lot of configurable parameters that control exactly how binder will go about this task. I'll try to explain the basic steps I go thru when I'm setting up binder for the first time in a new region of interest. And I'll give a few details about how binder goes about its tasks (refer to this flowchart as we go).

Getting Started: Earthworm Setup & Network Definition

The three most basic things binder needs to know are:

  1. Where should it get P-wave arrivals (picks) to work on?
  1. What stations are being picked and where are they?
  1. What's the velocity structure like?

1. P-wave arrivals

Two commands in binder's configuration file tell it where it should get the P-wave arrivals to work on. The "RingName" command specifies which Earthworm transport ring (a shared memory region) to read from. Make sure that this is the same ring into which pick_ew is writing its output. Also, if you're importing picks from another source and want binder to use them, make sure you are placing them in this ring as well.

The "GetPicksFrom" command specifies the installation id and module id of the TYPE_PICK2K messages that binder will operate on. Specific id's or wildcards can be used for either or both of these arguments. Note: both binder_ew and eqproc should be set to listen to the same pick source(s)!

2. Station List

Binder expects the network's station list to be in a file in Hypoinverse station format; it gets the name of the file from the "site_file" command. Of all the fields in this file, the only ones used by binder are the station codes (5-letter site code, 2-letter network code, 3-letter component code) and the station latitude and longitude. A quirk of this file is that all lats and lons are listed as positive numbers; a letter is used to denote N,S,E,W, with blanks interpreted as N and W. As binder reads the file, it converts lat and lon to decimal degrees with N and E being positive. If a site has multiple components, each component should be listed on its own line in the station file. As of Earthworm v4.0, binder treats each component independently, so if multiple components at a site are picked, that site will be weighted more heavily in stacks and locations than a single-component site.

Large networks (over 1000 channels) will have to use the "maxsite" command to increase the number of stations that binder will input and use.

Since binder works only on arrival times, it doesn't need to know anything about instrument response or gain.

3. Velocity Model

Binder uses one simple velocity model to calculate travel times for the entire region of interest. If the velocity structure varies greatly in your region, make sure you give binder a decent average model. This velocity model is specified with a series of "lay" commands in the configuration file, listed in order of increasing depth. The velocities in each layer are constant, and should increase with depth (no low velocity zones, please). Pay attention to the order of the arguments of the lay command. It should be "lay depth(km) velocity(km/s)". If you get depth and velocity swapped, binder won't do much of anything.

Binder also uses one P/S velocity ratio; the default is 1.72, but it can be changed by using the "psratio" command in the configuration file.

Stacking - The Guts of Binder

How does binder identify new earthquakes, or what does "stacking" mean? Binder needs to find the likely origin, in space and time, of a group of P-wave arrivals. Given a single P-wave arrival (call it the "initiating pick"), binder considers the entire volume of the earth is a potential hypocenter. The addition of a second P-wave arrival gives binder information on relative timing; so it can narrow the field of possible hypocenters to a 3-dimensional surface. To determine this surface, binder essentially takes the P-wave travel-time curve, puts the curve's origin at the station's location, flips the curve upside down (about a horizontal axis), shifts it in depth considering the relative timing between the two picks, and spins the curve around a vertical axis. This creates a conic-like surface (somewhat like a hollyhock flower opening down) that passes through all possible hypocenters which could have produced that second arrival, given the timing of the first. Binder calculates the "hollyhock" of potential hypocenters for each subsequent pick in the same manner, always using to the initiating pick's time as the reference time. When the "hollyhocks" from two arrivals overlap, you get a line of potential hypocenters that is consistent with both picks and the initiating pick. The point where three or more individual "hollyhocks" intersect is a unique hypocenter (stacking location) that can explain all of those arrivals, including the initiating pick. At this stage, with four P-wave arrival times, binder has successfully "stacked" a new "active event." It associates (or binds) these four picks into an event and passes those picks and the stacking location to its simple L1 locator for further processing.

Instead of considering an infinite number of possible hypocenters, binder discretizes the problem by using a stacking grid. Within its boundaries, the grid contains uniformly-sized cubic cells. Only the center point of each grid cell is considered a possible hypocenter in the stacking process described above. Below I'll describe the parameters that are used to define the stacking grid and control the stacking process.

While setting up the stacking grid, I find it very useful to look at a map of the seismic network and historic seismicity in the region of interest. The things I look for are:

  • seismic network: overall distribution & station spacing
  • seismicity: latitude, longitude & depth ranges

Stacking Grid Boundaries

The boundaries of the stacking grid are set with three commands:

  • "grdlat" (latitude range in decimal degrees, positive = north),
  • "grdlon" (longitude range in decimal degrees, positive = east),
  • "grdz" (depth range in kilometers, positive = down).

Reviewing the map of historic seismicity may help in setting the stacking grid boundaries. Binder can only identify new hypocenters that lie within these boundaries, so they should include the entire volume of interest (ie, wherever you expect or hope to see earthquakes). Also, all of your seismic stations should lie within the grid boundaries.

Stacking Grid Cell Size

Now you must decide how big each cell within the grid will be. Each cell is a cube, and its side-length (in kilometers) is set with the "dspace" command. That map may be useful again; hopefully, you can set dspace so that only one station falls within each grid cell. The more stations that fall within one cell, the more trouble binder will have in "focusing" the stack (when 2 stations are in the same grid cell, their "hollyhocks" will overlap 100%, which doesn't do much good for resolving potential hypocenters). Then again, you don't want to make the cell size too small, or binder will have a lot more computations to do. So there's a tradeoff here. FYI, the Northern California Seismic Network uses a grid cell size of 4 km.

Stacking Distance Constraints

When stacking each additional pick, binder considers possible hypocenters for that arrival within a limited distance of the station, not the entire stacking grid. The stacking distance is controlled by the "rstack" command. Binder computes each pick's "hollyhock" within its stacking volume. The stacking volume includes the entire depth range of the stacking grid, but in map view the stacking volume is a square centered on the station location, extending rstack kilometers in each horizontal direction (the square's sides are 2*rstack km long). To avoid roundoff problems, rstack should be a multiple of the grid-cell dimension, dspace.

Once again, the network map will be handy for choosing an appropriate rstack. Consider your network's station spacing and distribution - phases from stations that are more than 2*rstack km apart will not stack together other because their stacking volumes will not overlap at all. Also consider your expected seismicity patterns - binder can only identify new hypocenters where the stacking volumes of 4 or more stations overlap. Make sure that your expected seismicity falls within such a region, increasing rstack if necessary. After an event is stacked, binder's L1 locator may move the hypocenter beyond the stacking volume, and possibly even outside the stacking grid boundaries.

But, you have to stack it first!

Stacking and Time Resolution

Binder includes a time-resolution factor ("tstack" command) in calculating the set of all possible hypocenters ("the hollyhock") for each pick. This time factor controls the thickness of the hollyhock petals. Increasing tstack could increase the number of grid cells each pick's hollyhock. As a first approximation, tstack should be at least the number of seconds it takes for a P-wave to cross a grid cell of dspace km.

Pre-Stack Glitch Filtering

A glitch is a group of coincident arrivals often caused by noisy telemetry. For the purposes of this filter, a glitch is defined as a certain number of picks (m) occurring in a given number of seconds (x). The values of both m and x are configurable with the "define_glitch" command (the default definition is 4 picks within 0.035 seconds). This glitch filter was added to the stacking portion of binder to prevent it from creating new events from a group of picks having nearly identical arrival times. Picks that are flagged as belonging to a "glitch" are not used in summing the stack, but may later be associated with an event. If you want to turn off pre-stack glitch filtering (ie, to make binder stack every pick), set m to zero (example "define_glitch 0 0.0").

Summing the Stack - Stacking Weights & Event Thresholds

OK, here we go. Binder gets a new pick and it doesn't associate with any of its currently active hypocenters. Binder will now attempt a stack with that new pick as the "initiating pick". First it gathers up all recent unassociated picks (binder keeps track of the last 1000 picks it has processed, but it only tries to stack with a limited number, set with the "stack" command). Then it reviews that list of unassociated picks to see if there are any glitches (see section above). Any picks that belong to a glitch will be eliminated from the stacking process. If the initiating pick itself belongs to a glitch, binder will not attempt a stack at all.

Now let's talk about stacking weights. The stacking weight of each pick is based on the quality assigned to that pick by the pick_ew process. By default, binder stacks only quality=0 picks with a weight=1. Using "grid_wt" commands, you can include other quality picks in the stacking process. You could also give high-quality picks greater weight in the stack than low-quality picks. This differential weighting is especially useful for large networks like NCSN; we want to avoid generating events based on a few low-quality picks, but will accept an event that's based on a larger number of low-quality picks (or a combination of high-and low-quality picks). For small networks (10s of stations), I would suggest using equal stacking weight for all quality picks. Just be sure to use a "grid_wt" command for each pick quality (including 0) that you want binder to include in the stack.

The "grid_wt_instid" is a new twist (June 2002) to the stack weighting scheme. It allows you to set the stacking weight of a pick based on its pick quality and its installation_id. For example, this allows a user to weight a quality 2 pick from one installation differently than a quality 2 pick from a different installation. This is only useful for networks that exchange pick data with other institutions.

To sum the stack, each grid cell is assigned a "hit count" to keep track of how many picks that cell is a potential hypocenter for (how many "hollyhocks" that cell is a member of). During the stack, binder only keeps track of the grid cells within the initiating pick's stacking volume. At the beginning of a stack, the hit count of every cell in the initiating pick's stacking volume is set to initiating pick's stack weight. Then binder calculates the hollyhock for each additional pick; for each grid cell that lies within the new pick's hollyhock and the initiating pick's stacking volume, the hit count in incremented by the stack weight of the new pick.

After binder sums the stack using all picks in its short list, it reviews the hit counts of all the grid cells. Binder prepares a list of cells that share the maximum hit value and the same minimum distance to the initiating phase. Binder declares a new hypocenter from the stack only if the maximum hit count is greater than or equal to an event threshold value (set with the "thresh" command) and the number of cells in the list is less than an event focus value (set with the "focus" command). The trial hypocenter for the declared event is calculated by averaging the center positions of all the cells in the maximum-hit-count list. Each pick that contributed to thatmaximum hit count is associated with the new event.

It's very important that at least 4 phases contribute to declaration of a newly stacked earthquake. If all picks are being weighted equally (no "grid_wt" commands were used), the event threshold ("thresh" command)should be set to 4 or higher. If differential stacking weights are being used, the event threshold should be greater than 3 times the maximum stacking weight specified by "grid_wt" commands.

Finding the optimum value for the event focus value ("focus" command) may require considerable experimentation for any specific network. The optimal value will depend somewhat on the stacking grid granularity ("dspace" command). Since the trial hypocenter is calculated by averaging the center positions of all the cells in the maximum-count list, a larger focus threshold will increase the probability of a poor starting location (which has its own attendant problems). A smaller focus value, however, will tend to reject distant events.

Stacking and Eventids

Every time binder identifies a new event with a stack, it assigns an eventid to it. Binder uses this eventid in its internal bookkeeping to know which picks are associated with which events. It also writes the eventid in the TYPE_LINK and TYPE_QUAKE2K messages that it outputs, so that downstream modules (eqproc, eqprelim) can also do their own bookkeeping.

Binder keeps track of the next available eventid in a file called "quake_id.d" that is written in the EW_PARAMS directory. The actual command written to the file is the "next_id quakeid" command. This file is read once on startup (actually, the first eventid used after startup is quakeid+1), and is updated after every new event is stacked. To ensure sequential, non-duplicated eventids between runs, binder's configuration file must contain the command "@quake_id.d". If this command is omitted, event sequence numbers will be issued beginning at 1 each time binder is restarted.

You should never see the same eventid twice, but you may see gaps.

On startup, binder always skips one eventid (just in case it died in the middle of writing the quake_id.d file), and it may kill events for various reasons. Also, the eventids may appear downstream out of numerical order. Sometimes a smaller event that occured after (thus had a higher eventid) a large one will finish first.

Associating New Phases with Active Earthquakes

Whenever binder retreives a new pick, the first thing it does is try to associate the arrival time with one of its currently active earthquakes. Even though the Earthworm pick_ew module supposedly only picks P-wave arrivals, binder assumes the pick is an unknown phase during the association process. Knowing each event's origin time and location (binder keeps track of the most recent 100 hypocenters), the station location, and the velocity structure, binder calculates the arrival times of each event's possible phases (P, Pg, Pn, S, Sg, Sn) at that site. Then it compares the actual arrival time to the calculated times to find the residuals. Any phase with a residual within the configured residual-tolerance qualifies that pick to be associated with that event as that phase. The phase having the lowest residual value (within the residual-tolerance) wins! Binder associates or links the pick with the event with the winning phase, then it hands that event to the L1 locator for further processing. If no phase has a residual within the residual-tolerance, binder performs a stack with the new pick as the initiating pick (see stacking section above).

Association and Time Constraints

As I mentioned above, binder keeps track of the 100 most recent hypocenters. When it tries to associate a new pick, it doesn't actually consider all of those hypocenters as possible sources for the pick. Instead, it considers only those hypocenters where the pick time is within a given time range of the origin time. The "t_dif tmin tmax" command defines the acceptable time range (tmin, tmax are in seconds). If the pick time is earlier than (origintime + tmin) or later than (origintime + tmax) for a given event, binder will not attempt to associate that pick with that event.

Association and Distance Constraints

Binder keeps track of the average epicentral distance (rAvg) of all phases associated with each active hypocenter in its list. Before binder attempts to associate a new pick with a given hypocenter, it compares the epicentral distance of that station with a cutoff distance based on the rAvg value for that hypocenter. If the new pick's epicentral distance is greater than the cutoff distance, binder will not attempt to associate the new pick with that hypocenter. The cutoff distance is calculated by rAvg*rAvg_Factor, where rAvg_Factor is a constant (default value = 10.0) that can be set with the "rAvg_Factor" command. This distance cutoff is important in cases where two earthquakes occur with seconds of one another in different parts of the network. It also keeps an event from being contaminated by distant spurious noise picks which happen to match the event's travel time curve.

Configuring the Travel-time Residual Tolerance Function

Whether or not a given phase is associated with, culled from, or scavenged for a given event is determined by how closely its arrival time fits the predicted arrival time for the event (ie, how large its travel-time residual is). The overall maximum-allowed travel-time residual (residual tolerance or restpr) for a pick to be associated with a given active hypocenter is composed of three terms:

   restpr = dist-taper + res1_OT + res2_OT

The first term (dist-taper) is based on the distance of the current pick to the hypocenter (controlled with the "taper" commands) and two remaining terms (res1_OT and res2_OT, controlled with the "taper_OT" command) are based on the quality of the event's location. All of these terms are configurable to allow you to widen the residual tolerance for more distant picks and/or for more poorly located earthquakes.

Residual Tolerance Distance Term, dist-taper

The distance-dependent term of residual tolerance, dist-taper, is controlled with the "taper r resmax" command, where r is the epicentral distance (km) and resmax is the value of the dist-taper term (seconds). Multiple taper commands (up to 100) can be used, in order of increasing r, to define the dist-taper term of the tolerance function. Between the distances specified, the value of the dist-taper term changes linearly between corresponding resmax values. For distances less than the lowest value specified, the dist-taper term is constant at the resmax value for the smallest distance; for distances exceeding the maximum specified, the dist-taper term is constant at the resmax value set for the maximum distance. If no "taper" commands are supplied, the dist-taper term is a constant 1.0 sec for all distances.

The dist-taper function defined by the "taper" commands also plays a role in how a pick is weighted in the relocation process (see the distance-weight factor in the "Relocation and Pick Weighting" section below).

Residual Tolerance Hypocentral Quality Terms, res1_OT and res2_OT

The "taper_OT OTconst1 OTconst2" sets two constants that are used to calculate the hypocenter-quality terms of the maximum-allowed travel-time residual, restpr. The thinking is, if your location is poor, you'll want to accept a phase with a higher travel-time residual. If no taper_OT command is listed, the default values are OTconst1 = OTconst2 = 2.0. To eliminate the two hypocentral quality terms from the residual tolerance function, set both constants to 0.0.

The res1_OT term is based on event's rms residual and the number of phases associated with the event. Binder uses the first constant, OTconst1, in calculating res1_OT for each hypocenter as follows:

   res1_OT = OTconst1 * rms * Npick_factor

   where  Npick_factor = 3.0                     when npick <= 4
                       = SQRT( npick/(npick-4) ) when npick  > 4

When the number of picks is small, the rms of the location is often quite low, so the Npick_factor and OTconst1 will dominate the res1_OT term. As the number of picks associated with the event increases, the Npick_factor approaches 1, so the event rms and OTconst1 will dominate the res1_OT term.

The res2_OT term is based on the hypocentral distance to the nearest station and the second constant, OTconst2. Given the hypocentral (slant) distance to the nearest station calculated by:

   s = SQRT( z**2 + dmin**2 )  where    z = hypocentral depth
                                     dmin = epicentral distance to
                                            nearest station

binder calculates the res2_OT tolerance term as follows:

   res2_OT = 0.0                     when s  < 1.0
           = OTconst2 * log10( s )   when s >= 1.0

The larger the minimum slant distance, the more unreliable the location, and the larger this second residual taper term becomes.

OK, so that describes the form of all three terms in the residual tolerance function, but it doesn't really give you a clue as to what values to use for your network. I think it's most important to set a reasonable distance taper for your network based on how well your velocity model fits your arrival-time data. Hopefully, you already have some idea about that. The residual tolerance can then be widened if necessary using the origin uncertainty terms. The values of all three terms are printed in binder's logfile, so you may be able to empirically settle on some parameters by studying those.

Binder's Simple Event Locator

After binder declares a new event from a stack, or after is associates a new pick with an active event, it passes the event to its simple L1 locator for further processing. The locator takes a starting location and all the phase data for the event, determines a weight factor for each phase in the relocation, and performs a number of iterations to arrive at a new location. Again, there are quite a few configurable parameters available to control the locator, and I'll describe them below.

Intermittent Relocation

While binder's locator is simple, it can still be quite a CPU hog, especially when a large number of phases are associated with the event. Also, as the number of associated phases grows, the hypocenter becomes more stable, and the improvement in the location with each additional phase shrinks. Therefore, we added the configuration command "locate_eq npick interval" to allow intermittent relocation of an event. Instead of relocating after every single new phase is associated, an earthquake with more than "npick" picks will relocate with every additional "interval" picks associated. Up to 10 "locate_eq" commands can be supplied (in order of increasing "npick" values) to set multiple npick-break-points and location intervals. An event with fewer picks than the smallest "npick" value will locate after each additional pick is associated. If no "locate_eq" commands are used, every event will be relocated after each addition pick is associated.

Relocation and Pick Weighting - Phase Type, Pick Quality, and Distance

The overall weight of a pick in the location process is the product of three weighting factors: a phase-weight, a pick-quality-weight and a distance-weight.

     wt = phase-weight * pick-quality-weight * distance-weight

The phase-weight is based on the phase assigned to the pick by binder, in conjuction with the "ph" commands listed in the configuration file. Use one "ph phase weight" command to assign a "weight" (any number from 0.0 to 1.0, inclusive) for each "phase" (any of the strings P,S,Pg,Sg,Pn,Pg) that you want binder to use in the location. If at least one "ph" command is supplied, other phases that were not listed in a "ph" command receive a phase-weight of 0.0 (ie, they won't be weighted in the location process). If no "ph" commands are used, all phases receive a phase-weight of 1.0.

The pick-quality-weight is based on the pick-quality assigned by pick_ew, in conjuction with the "wt" commands listed in binder's configuration file. Use one "wt quality weight" command to assign a "weight" (any number from 0.0 to 1.0, inclusive) to any "quality" (0,1,2,3 or 4) of pick that you want binder to use in the location. If at least one "wt" command is supplied, other pick-qualities that were not listed in a "wt" command receive a pick-quality-weight of 0.0 (ie, they won't be weighted in the location process). If no "wt" commands are used, all quality picks receive a pick-quality-weight of 1.0.

The distance-weight is based on the epicentral distance of the station from the current starting location and the "r_max" and "taper" commands listed in the configuration file. The "r_max" command is used to set the maximum epicentral distance for weighting picks in the location process. Any pick at a distance greater than r_max km from the starting epicenter will not be used in relocating the event. For picks within r_max km of the epicenter, the distance-weight is a function of the current distance, r, and the distance-taper function set with the "taper" commands (see the "Residual Tolerance Distance Term, dist-taper" section above):

  distance-weight = ((min dist-taper value)/(dist-taper value at r))**2

The overall weight of a pick in the relocation process will be a number between 0.0 and 1.0, inclusive.

The total number of phases binder can use in relocating an event is limited using the "maxpix" command. The default value (256) should be plenty for most networks.

Relocation and Iteration Control

Binder hands its locator a starting location and a list of phases. The locator determines the pick weights as above, then calculates the event's average weighted travel-time residual (rms) to use as a reference for testing divergence.

For most earthquakes, the first iteration performed by the locator will be a "Free" solution. It will invert for all four hypocentral parameters: latitude, longitude, depth, and origin time. However, if an event has no nearby picks associated, the locator will perform "Fixed Depth" solution instead. This feature is controlled with the "FixDepth dmin trialz" command (default dmin=50.0, trialz=8.0). If an earthquake has no picks closer than "dmin" km, binder sets its depth to "trialz" km, then inverts for only latitude, longitude, and origin time. This prevents events with no depth resolution from flailing.

The result of the locator's inversion is the "step-vector." This is the amount by which the iteration will change all 4 hypocentral components. Binder places limits on the horizontal (xystep) and vertical (zstep) components of the step-vector in any given iteration. Those limits are set with the "MaxStep xystep zstep" command (default xystep=10.0 km, zstep=2.0 km). If either step-length is exceeded in an interation, all 4 dimensions (x, y, z and time) of the step-vector are equally damped, preserving the direction of the vector, such that each dimension is within its configured limit.

Binder also place limits on the valid range for computed hypocentral depth. This range, specified with the "zrange zmin zmax" command (default range is 2.0 - 20.0 km), should span the depth range in which you expect to detect earthquakes. It must also span the entire depth range of the stacking grid ("grdz" command); otherwise, events may stack outside of the zrange limit and this could cause locator logic errors. If during an iteration, the step-vector causes the depth to fall outside of the zmin-zmax range, that step vector is abandoned. The locator then "restarts" the iteration; beginning with the previous starting hypocenter, the locator performs a fixed-depth solution. This feature is useful in controlling locations from network "glitches" where all picks are coincident and the apparent high phase velocity would try to locate the quake at the center of the Earth.

At the end of each iteration, binder recalculates the event's average weighted residual (rms) and performs a test for divergence. If the new rms is greater than the previous rms*"constant" (set with "MaxDeltaRms" command, default = 1.01), then the solution is diverging. If the location is converging, the locator goes on to the next iteration. If a location is diverging, binder removes half of the step-vector in all 4 dimensions (x, y, z and time), backing the location up toward the starting location from that iteration, and then checks the rms again. If the solution now converges, the locator goes to the next iteration. If the solution is still diverging, it backs the location up farther. Up to 3 backwards steps will be made (resulting in a step-vector of 1/8 the original size) if the solution continues to diverge.

Binder's locator will perform a maximum number of iterations every time that it's called. The maximum iteration count can be set with the "MaxIter" command (default = 4). However, the locator may perform fewer than MaxIter iterations if the change in the location (in X-Y-Z km) in any given iteration is less than a minimum step-length (set with the "MinXYZstep" command; default = 0.1 km).

Post-Relocation Event Review

Pick Culling, Pick Scavenging, and Killing Events

Binder has just relocated an earthquake. Presumably, the location is better contrained that it was before, and the arrival time fits may have changed as well. Binder now recalculates the travel-time residuals for all phases associated with the event, regardless of whether the phase was weighted in the relocation or not. It also calculates the event's average weighted residual (rms), and performs a series of tests.

Pick Culling For each associated pick, binder compares the new travel-time residual to the new residual tolerance value for that station (see the "Configuring the Travel-time Residual Tolerance Function" section above). If the new residual is larger than the tolerance, binder deassociates or culls the phase from the earthquake. That pick is now free to associate with another event, or to stack with other picks to identify a new earthquake. If any picks are culled from an event, that event is sent back to the locator once again.

Pick Scavenging After an event is relocated, binder also looks through its list of recent picks to see if there are any others which may fit the travel-time curves of the new hypocenter. Essentially, it does the association process (comparing the travel-time residuals of various phases to the residual tolerance at that station) on picks that are in already its memory; binder is scavenging. Binder is allowed to scavenge two classes of picks in its list: those that are not associated with any active event (also known as "waifs") and those that are associated with events that have fewer phases than the scavenging event. So big events are allowed to steal phases from smaller events, as long as the travel-time residuals fit. Since a pick can only belong to one event at a time, this scavenging feature makes it possible for two events to merge into one. This is useful in avoiding "split events", where a large event is originally identified as two events due to the order in which its arrivals were processed.

If an event successfully scavenges one or more new phases, the event is sent back to the locator once again. Also, if any of the scavenged phases had been associated with a second event, that second event is also sent back to the locator.

Killing Events In reviewing events, binder sometimes decides that an event is no longer valid, either because its location parameters are poor, or because it doesn't have enough supporting phases. If after a relocation, an event's rms is larger than a cutoff value (configured with the "rmscut" command, default = 1.0 sec), binder marks that event as destroyed or killed. Or, if after a round of scavenging, an event has fewer than 4 weighted phases (too few to relocate the event), binder kills that event as well.

When binder kills an event, it deassociates or unlinks all of its phases. Those picks are then free to associate with other active events as waifs, or to stack with other picks to identify new events. To notify downstream modules that the event was killed, binder issues a TYPE_QUAKE2K message for that eventid, setting the "number of associated phases" field to zero.

Quality Control: Pick-group Assessment

Occassionally one of binder's newly stacked events will include an outlying pick which falls at a "leverage point" for the hypocenter produced by binder's locator. This bad pick has undue influence over the hypocenter, causing a poor location which in turn leads to the mis-association of other picks. Binder uses a test called "pick-group assessment" to identify and eliminate these outlying picks, and to produce a trial hypocenter for the locator. Given a group of supposedly related picks, pick-group assessment tests whether all of the arrivals are indeed consistent with a single earthquake.

Pick-group assessment is most efficient when it is performed on an event with relatively few associated phases. The configuration command "assess_pk minpk maxpk" controls when the test is run; when an event has between minpk and maxpk (inclusive) picks associated with it, its pick set will be assessed before the event is handed to the locator (defaults are minpk=6, maxpk=12). If an event stacks with more than maxpk picks, its pick set will be assessed at least once. To turn off pick-assessing totally, set minpk and maxpk to zero.

Only P-wave arrivals are used in pick-group assessment. First, the set of picks is sorted in increasing time order and is examined for glitches. Glitch picks are eliminated from further testing (see the "define_glitch" command described in the "Pre-Stack Glitch Filtering" section above). The remainder of the test involves a number of resampling trials. In each trial, four picks are chosen from the pick set and the exact four-station hypocenter is calculated using a uniform velocity halfspace (configured with the "v_halfspace" command, default = 5.0 km/s). Next, binder calculates the travel-time residual with respect to that hypocenter for every pick in the pick set (again using the uniform velocity halfspace). You can specify the maximum pick quality (assigned by pick_ew) to use in this resampling process with the "maxwt" command (valid values are 0-4, default = 3). Picks with qualities higher than maxwt (poorer quality) may be rejected as glitches, but will not be used in resampling trials.

The number of trials used in the pick-group assessment is controlled with the "maxtrial" command (default = 500). Given n picks sampled 4 at a time, the total number of unique combinations is n!/(4!*(n-4)!).

If this number is less than maxtrial, all possible combinations will be used, with no duplication. If the total is greater than maxtrial, then maxtrial random samples of 4 picks will be taken, and duplication is possible.

Statistics on the residuals from all of the trials are used to determine if each pick should be accepted or rejected. If a pick is an outlier, it will have very large residuals for any of the hypocenters determined from 4 other picks. Also, the scatter in its residuals will be large. For each pick, binder calculates the median residual (med), its absolute value (absmed), and the median of the absolute deviations from the median residual (mad). If either the absmed or mad is greater than a configured cutoff threshold, the pick is rejected as an outlier. The cutoff values are configured with the "residual_cut maxabsmed maxmad" command (default maxabsmed = 3.0 sec, maxmad = 10.0 sec).

Pick-group assessment is a multiple pass process. The first pass samples from the earliest (up to 25) P-phase picks which meet the testing criteria; this pass is used to winnow the data to arrivals with median residuals below a cutoff value. A second pass resamples the winnowed data to determine the new starting hypocenter from the average of all the 4-pick hypocenters. This new starting hypocenter and the winnowed list of picks are handed to binder's locator for further processing.

If any picks were rejected as glitches or outliers by the pick-group assessment, they are deassociated from the event and are free to restack or reassociate. Occassionally, binder rejects enough picks that the event no longer has enough valid phases to relocate it. In this case, the event is killed, and all of its picks are deassociated.

Examples from Binder's Log File

  1. Stack: five picks stack to create a new event.
  1. Glitch Detection in Stacking Phase: a series of picks stack, until the last pick identifies the group as a glitch.
  1. Association: two examples: a pick associates with an active event and a pick fails to associate or stack.

Trouble-Shooting

Binder's not producing any earthquakes - what could be wrong?

  1. Is it seeing any picks?

Check the log file, you should see some lines like these:

 10  6  3 2189 MWB  NCVHZ D2  19990930005310.08     431     672     584
19990930_UTC_00:53:15 grid_stack, mhit = 0

The first line is a TYPE_PICK2K message, the second is the log of an stacking attempt.

  1. Is the velocity model specified with "lay depth velocity"?
  1. Is the station file in the correct format?

All lats & lons should be positive in degrees & minutes, with N,S,E,W denoted by letters. If there are no letters, N and W are assumed.

  1. Are the stacking grid boundaries set properly?

Lats & lons are in decimal degree with positive = N, E.

Configuration File Commands

Helpful Hints

Attachments