Log 3
Justin Vandenbroucke
justinv@hep.stanford.edu
Stanford University
9/25/02 – 10/16/02
This file contains log entries
summarizing the results of various small subprojects of the AUTEC study. Each entry begins with a date, a title, and
the names of any relevant programs (Labview .vi files or Matlab .m files – if
an extension is not given, they are assumed to be .m files).
9/25/02
AUTEC_programs_9/25/02
Wrote a new version of the online code,
AUTEC_programs_9/25/02. This version
decouples the threshold by channel, so each channel has its own adaptive
threshold. This obviates the need for
perfect correction of pressure levels for variation in phone gains. It also allows the lowest reasonable
threshold to be used at each phone independently.
9/26/02
Analysis on erinyes
All data have been organized on
erinyes in /Data/AUTEC/AUTEC_data.
Matlab has also been installed and is running successfully.
9/27/02
Data overview
dataMatrix.m
mergeDataMatrix.m
plotDataMatrix.m
The plots below summarize the entire data set currently on disk
(through mid Jun 2002). More data for
Jun – Sep should be arriving shortly.
Of 183 days with 1 or more minutes of data taken, 94 have data taken for
all 1440 minutes of the day.
9/29/02
Test of AUTEC_programs_9/25/02
Received 5 minutes of data from
9/27/02 from the new online code. It
looks okay except for two things: there are no events in the 5 minutes (perhaps
the threshold is always slightly too high); the threshold array for the second
minute of data has zero entries.
10/02/02
Data set status and accessibility
All data are uncompressed on
erinyes.stanford.edu. All file formats
(1-5) are now readable with readMinuteData.m and readEventRecord.m. Several days’ folders have been renamed from
yyyy.mm.dd to yyyy.mm.dddev, because they were taken during development and the
file format changed in a complicated way during that period. With more work they could be made readable.
10/02/02
Size of data set
dataSize.m
The plots below summarize the size
of the data on disk, when uncompressed (typical data can be zipped to 15% of
its uncompressed size).
10/2/02
Event rates
plotNumEvtsLookup.m
The plots below summarize our event rates.
10/2/02
Log of postamp changes
Dan Belasco sent me (email
10/2/02) the entries from his log that
indicate changes in postamps for our 7 phones. They are as follows (changed from their numbering to ours, ignoring
phones we’re not using):
10/11/01 irregular readings of 2, 4, 7
10/17/01 disconnected 1, 3, 5, 7 for repair
12/18/01 3, 5, 7 changed; all 340’s phones working except 346 (1)
12/20/01 346 (1) changed and working; replaced 2, 4, cks. Good.
1/4/02 1 changed
6/7/02 1, 3 replaced
8/15/02 7 replaced
10/2/02
Wind speed data
plotDailyMeanNoiseEnergies.m
nassauCoords.m
NOAA has near-daily measurements of
mean wind speed at Nassau Int’l Airport available online (see
AUTEC_analysis/climate_data/mean_wind_speed/readme.txt). Each month has data for most days. Several
months are missing a few days at most.
I obtained data for 24 consecutive months, 9/00 – 8/02. The first plot below gives the daily values,
and the second plot gives the monthly mean of these values. There is wide variation in the daily
data. Rough annual structure is evident
in the monthly data. Nassau Int’l
Airport is located at a latitude of +2505 (25° 5′ N), a
longitude of –07746 (77° 46′ W), and
at 7 m above sea level. Our central
hydrophone is at 24° 24′ 28.8″ N, 77° 32′ 36.7″ W. In our coordinates, Nassau is 75 km North
and 23 km West of the central phone, or at a distance of 79 km, 17 degrees West
of North.
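The offsets quoted above can be checked with a short flat-Earth calculation. The sketch below is in Python rather than Matlab and is not nassauCoords.m; the Earth radius, the flat-Earth approximation, and all function names are assumptions introduced for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; assumed value, not taken from the log


def dms_to_deg(degrees, minutes=0.0, seconds=0.0):
    """Convert degrees / arcminutes / arcseconds to decimal degrees."""
    return degrees + minutes / 60.0 + seconds / 3600.0


def local_offset_km(lat0, lon0, lat1, lon1):
    """North and east offsets (km) of point 1 relative to point 0.

    Uses a flat-Earth approximation, adequate for separations under ~100 km.
    Latitudes are positive north and longitudes positive east, in degrees.
    """
    mean_lat = math.radians((lat0 + lat1) / 2.0)
    north = math.radians(lat1 - lat0) * EARTH_RADIUS_KM
    east = math.radians(lon1 - lon0) * EARTH_RADIUS_KM * math.cos(mean_lat)
    return north, east


# Coordinates as quoted in the log (west longitudes entered as negative).
phone_lat = dms_to_deg(24, 24, 28.8)
phone_lon = -dms_to_deg(77, 32, 36.7)
nassau_lat = dms_to_deg(25, 5)
nassau_lon = -dms_to_deg(77, 46)

north_km, east_km = local_offset_km(phone_lat, phone_lon, nassau_lat, nassau_lon)
distance_km = math.hypot(north_km, east_km)
# Angle measured from north toward west, in degrees.
bearing_west_of_north = math.degrees(math.atan2(-east_km, north_km))
```

With these inputs the sketch gives roughly 75 km north and 23 km west, and a distance and bearing close to the values quoted in the entry.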
10/2/02
Structure in noise energies
plotDailyMeanNoiseEnergies.m
The plots below give the noise
energies in all channels at each of 1440 minutes during a day. Periodic structure is evident.
10/4/02
New data shipment
Below are plots updated to include a
shipment of data received 10/3/02. We
now have data for 3e5 minutes, equivalent to 208 days of continuous
running. We have 2.5e7 events, giving
an overall rate of 80 events / minute during live minutes.
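The two derived numbers in this entry follow from simple arithmetic on the rounded totals. A minimal check, in Python for illustration (variable names are mine):

```python
MINUTES_PER_DAY = 1440

live_minutes = 3e5   # minutes of data, as quoted above
total_events = 2.5e7  # events, as quoted above

# Equivalent number of days of continuous running.
equivalent_days = live_minutes / MINUTES_PER_DAY

# Overall event rate during live minutes.
events_per_live_minute = total_events / live_minutes
```

The rate from the rounded totals comes out slightly above 80 events per minute, consistent with the figure quoted to one significant digit.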
10/5/02
Distribution of run durations
plotLengthDist.m
The plot below gives the
distribution of run durations.
10/8/02
eventListComp.m
checkEventListComp.m
eventListComp.m, the program that
determines summary data for every event of every minute, has been run on all
data (took several days).
10/8/02
Filter frequency response
frequencyResponse.m
For every frequency from 1 to 60 kHz
in 1 kHz steps, a 56 ms sine wave was generated and sampled with the same
sampling frequency as online. It was
then run through our digital filter.
The plot below gives the ratio of the filtered RMS to the unfiltered RMS
for each frequency. This would be the
response if a pure sinusoid at each frequency hit the phone (with no noise
– the response function derivation assumed f^-2 noise).
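The sweep described above can be sketched as follows. This is not frequencyResponse.m: it is written in Python, the sampling rate of 179 kHz is inferred from the 17,900 samples per 0.1 s quoted elsewhere in this log, and the three-tap filter is a placeholder, since our actual digital filter coefficients are not given here.

```python
import math

SAMPLE_RATE_HZ = 179_000  # assumption: inferred from 17,900 samples per 0.1 s
DURATION_S = 0.056        # 56 ms test tone, as in the entry above
PLACEHOLDER_TAPS = [0.25, 0.5, 0.25]  # hypothetical filter, NOT the online filter


def rms(values):
    """Root-mean-square of a sequence of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def fir_filter(signal, taps):
    """Causal FIR filter; samples before the start are treated as zero."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * signal[n - k]
        out.append(acc)
    return out


def response_ratio(freq_hz, taps):
    """Ratio of filtered RMS to unfiltered RMS for a pure sine at freq_hz."""
    n_samples = round(DURATION_S * SAMPLE_RATE_HZ)
    tone = [math.sin(2.0 * math.pi * freq_hz * i / SAMPLE_RATE_HZ)
            for i in range(n_samples)]
    return rms(fir_filter(tone, taps)) / rms(tone)


# Sweep 1 to 60 kHz in 1 kHz steps, keyed by frequency in kHz.
response = {f_khz: response_ratio(f_khz * 1000.0, PLACEHOLDER_TAPS)
            for f_khz in range(1, 61)}
```

Substituting the real filter coefficients for PLACEHOLDER_TAPS would give the ratio plotted for this entry; with the placeholder, the result only illustrates the procedure.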
10/8/02
Spectrum
spectrumSTD.m
For comparison, the plot below gives
the power spectrum averaged over one hour.
Phone voltage levels are unrescaled.
10/9/02
Accurate Vrms measurements on all
time scales
plotVrms.m
Starting with file format 4,
installed 9/20/02, we calculate Vrms of every 0.1 s (17,900 sample) block of
data that is read. This allows accurate
and precise determination of Vrms on more time scales (before we were limited
to a minutely measurement derived from the power spectrum). The three plots below give Vrms’s evolution
over different time scales: one minute, one hour, and several days. The one-hour example is somewhat unusually
smooth. Many are entirely flat
(constant Vrms values) and many have periods of chaotic fluctuation.
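A minimal sketch of the block-wise Vrms calculation, in Python for illustration. It is not the online code or plotVrms.m; the function names and the choice to discard a trailing partial block are assumptions.

```python
import math

BLOCK_SAMPLES = 17_900  # one 0.1 s block, as stated above


def block_vrms(samples, block_len=BLOCK_SAMPLES):
    """Vrms of each complete block; a trailing partial block is discarded."""
    n_blocks = len(samples) // block_len
    result = []
    for b in range(n_blocks):
        block = samples[b * block_len:(b + 1) * block_len]
        result.append(math.sqrt(sum(v * v for v in block) / block_len))
    return result


def combine_vrms(block_values):
    """Vrms over a longer interval built from equal-length blocks.

    Mean squares add, so the combined value is the RMS of the block values,
    not their arithmetic mean.
    """
    return math.sqrt(sum(v * v for v in block_values) / len(block_values))
```

For example, combining the 600 block values in a minute with combine_vrms gives a minutely Vrms, and the same function applied again gives hourly or daily values.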
10/9/02
Vrms over entire data range; gains
calcMinutelyNoiseEnergies.m
calcGains.m
plotGains.m
A noise energy is calculated each
minute for each channel. Noise energy
is the power in all frequencies as determined from the minutely spectra,
each of which is computed from 0.1 s of data. Vrms can
be determined from noise energy, so a Vrms value is determined every
minute. The plot below gives the daily
mean of these minutely Vrms measurements.
These values are not corrected for gain. The solid line in April 02 is due to one channel being down
(channel 3).
10/10/02
Radar noise
plotGains.m
plotListSpectra.m
On 8/21/02, the radar at Site 3 was
removed (email from Dan Belasco, 10/9/02).
We would like to find the effect of this change on noise (Vrms; at
various frequencies; spike noise (50 kHz); correlated noise (60 kHz)). The first plot below gives unrescaled Vrms
values over time. There is no apparent
change around 8/21/02. Similarly, there
is no change in hourly mean powers (determined from minutely spectra) at 10, 20,
30, 40, 50, or 60 kHz. The second plot
below gives rates of each event code before and after the radar was
removed. Recall codes 0 and 1 designate
uncorrelated events; code 2 designates correlated electronic noise. The rate of correlated noise events does not
change with removal of the radar.
10/10/02
Offline rethresholding
keepBest60.m
plotDailyNumEvts.m
numBestEvtsLookupTable.m
numEvtsPerMinute.m
Due to the inadequate adaptive
thresholding algorithm that was running until recently, some minutes have
a threshold much too low and are swamped by data (e.g. ~1000 events in one
minute, compared to the target rate of 60 events per minute). keepBest60.m rethresholds offline, raising
the threshold in steps of 0.004. The
plot below gives the daily event rates before and after rethresholding. The total number of events is 26 million
(25.879407e6) before rethresholding and 11 million (10.760885e6) after (42% of events were
retained). On days that have events,
the average number of events per day is 106 thousand before rethresholding and
44 thousand after.
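One way the offline rethresholding could work is sketched below in Python. This is not keepBest60.m itself: the stopping rule (stop once at most 60 events remain), the use of a single peak value per event, and the names are assumptions. Only the step size of 0.004 and the target of 60 come from this entry.

```python
TARGET_EVENTS = 60      # target events per minute, from the entry above
THRESHOLD_STEP = 0.004  # threshold increment, from the entry above


def rethreshold_minute(peak_values, start_threshold,
                       target=TARGET_EVENTS, step=THRESHOLD_STEP):
    """Raise the threshold in fixed steps until at most `target` events pass.

    peak_values: one peak amplitude per event, in the same units as the
    threshold (assumed). Returns the final threshold and the kept peaks.
    """
    steps = 0
    threshold = start_threshold
    kept = [p for p in peak_values if p >= threshold]
    while len(kept) > target:
        steps += 1
        # Recompute from the start value to avoid accumulating rounding error.
        threshold = start_threshold + steps * step
        kept = [p for p in kept if p >= threshold]
    return threshold, kept
```

Because the threshold moves in discrete steps, a minute can end up with fewer than 60 events after rethresholding, not exactly 60.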
The first two small plots below give
the distribution of events per minute over the entire data set, before and
after offline rethresholding. The long
tail of the distribution is cut by rethresholding. The second pair of small plots gives close-ups of the same
plots. It is unclear why there is a
peak at ~20 events per minute separate from that at 60 events per minute. Perhaps there is some interesting phenomenon
occurring with an average rate of every 3 s that accounts for the peak at 20;
and the peak at 60 is due to triggering on gaussian noise when there is nothing
interesting going on. The peak at ~0
events per minute is likely due to the long time taken for the threshold to reach
low values when the noise level drops suddenly.
10/11/02
Spike events
findPointDisplacements.m
spikeDistribution.m
findPointDisplacements has been run
on all data (a new version is running now which writes nan when events have
zero time series length; until now they were ignored, which threw off the count
for each hour). The algorithm is as follows: For each event let len
be the smaller of 179 and the length of the time series. Then let
Vabs be the absolute value of the first len samples of the
time series in the triggering channel.
Find the largest value (max1) and second-largest value (max2)
of Vabs. The spike rejection
criterion is then defined to be (max1 – max2) / max1. It ranges from 0 to 1, with larger values
for point displacements. The first plot
below gives the distribution of the criterion for all events, with a bin size
of 1e-3. The large number of events
with criterion 0 is likely due to events in which the highest and second-highest
amplitude samples fall in the same ADC bin. There was a period early in running when the ADC dynamic range
was too small, which would have increased the rate of the two largest-amplitude
samples having equal amplitude. The
second plot below is a closeup showing that there are two populations,
corresponding to spike events (above 0.6) and non-spike events (below 0.6).
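The criterion described above can be sketched as follows, in Python rather than Matlab. It is not findPointDisplacements.m; returning nan for an all-zero series and treating a one-sample series as having max2 = 0 are assumptions for cases the entry does not cover.

```python
import math

MAX_SAMPLES = 179  # only the first 179 samples are examined, as above
SPIKE_CUT = 0.6    # events above this value are counted as spikes


def spike_criterion(time_series, max_samples=MAX_SAMPLES):
    """Return (max1 - max2) / max1 for the triggering-channel time series.

    max1 and max2 are the largest and second-largest absolute sample values.
    Returns nan for an empty series, matching the new version described above.
    """
    length = min(max_samples, len(time_series))
    if length == 0:
        return math.nan
    v_abs = sorted((abs(v) for v in time_series[:length]), reverse=True)
    max1 = v_abs[0]
    max2 = v_abs[1] if length > 1 else 0.0  # assumption for 1-sample series
    if max1 == 0:
        return math.nan  # assumption: criterion undefined for all-zero data
    return (max1 - max2) / max1


def is_spike(criterion, cut=SPIKE_CUT):
    """True if the criterion exceeds the cut; nan compares as not a spike."""
    return criterion > cut
```

A single isolated sample gives a value near 1, while two equal largest samples (for example, two samples in the same ADC bin) give exactly 0.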
10/11/02
Spike events were correlated with
Site 3 radar
dailySpikeRate.m
The first plot below gives the total
number of events each day as well as the number of spike events each day (where
an event is considered a spike if its criterion is above 0.6). Overall, of 26 (25.877921) million events,
1.6 (1.622968) million are spike events, corresponding to an overall fraction of
6%. The Site 3 radar was removed on
8/21/02. The second plot gives a close-up
of June-August 2002 (tick marks on the abscissa indicate the first day of
each month). Clearly the rate of spike
events dropped significantly when the radar was removed.
10/11/02
Confirmation of 1.5 s rep rate for
spike events
spikeDtDistribution.m
Time differences were determined between
consecutive spike events (those with spike criterion > 0.6). The distribution of time differences is
shown in the two plots below (the second is a closeup). Spike events are seen to occur every ~1.5
s. More precisely, the 50th
peak occurs at ~76 seconds, giving a period of 76 s / 50 = 1.52 s.
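A minimal sketch of the time-difference histogram and the period estimate, in Python for illustration. It is not spikeDtDistribution.m; the function names and binning choices are assumptions.

```python
import math


def consecutive_dts(event_times):
    """Time differences between consecutive events, after sorting by time."""
    times = sorted(event_times)
    return [later - earlier for earlier, later in zip(times, times[1:])]


def histogram(values, bin_width, max_value):
    """Counts in bins [k * bin_width, (k + 1) * bin_width) up to max_value."""
    n_bins = math.ceil(max_value / bin_width)
    counts = [0] * n_bins
    for value in values:
        index = int(value // bin_width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


def period_from_nth_peak(peak_time_s, n):
    """Repetition period, given that the n-th histogram peak sits at peak_time_s."""
    return peak_time_s / n
```

Using a high-order peak, as with the 50th peak at ~76 s above, reduces the relative error from reading the peak position off the histogram.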
10/11/02
Periodic structure of all events
dtDistribution.m
The four plots below are the same
histogram with different windows. The
histogram gives the distribution of time differences between all events. The four plots show the following structure:
(1) Periods of both 17 ms (60 Hz) and 20 ms (50 Hz); (2) An unexplained bump at
~0.35 s (rep rate of some biological signal?); (3) 1.5 s – periodic spike
events; (4) dt’s enforced to be greater than 17 s by buffer overflows (length
of buffer is 17 s), as well as 1.5 s structure still evident.
10/11/02
Periodic structure of all events
after offline rethresholding
dtDistributionBest.m
Reconstructing the above plot only
considering the ~1/2 of events that remain after offline rethresholding, the
peaks are retained while the smooth (exponential?) background is reduced. The plot below shows the 17 and 20 ms
structure, which stands in greater relief against a lower background.
10/15/02
Monthly number of events
plotNumEvtsEachMonth.m
The plot below gives the number of
events recorded each month. The total
number of events is 25.883803 million.
10/15/02
Nikolai’s parameters
nikParams.m
fn.m
plotFn.m
fnBest.m
Nikolai’s parameters have been
calculated for almost all events. The
first plot below is Nikolai’s mountain plot for all events. The second plot gives the mountains after
offline rethresholding has occurred.
The third plot gives the mountains for those events that were removed by
offline rethresholding. It is puzzling
that the three plots are so similar.
10/16/02
Correlated events (electronic noise)
eventCorrelationsCross.m
plotCorrelationDistribution.m
plotCorrelatedEventsPerDay.m
(data are in
mat_files/eventCorrelationsCross; check this if you run any of the above codes)
Both online and offline, we have
been removing cross-talk before calculating the correlation in order to reject
60 Hz electronic-noise correlated events.
However, it appears that we can distinguish the noise from good events
without removing cross-talk. The first
plot below gives the correlation of every event until the December 2001 upgrade
(during which the second-level trigger to reject correlated events was
installed), calculated without first removing cross-talk. There is one peak centered at 200 and
another at 4000. They can be separated
at roughly 1600. The second plot gives
the same distribution for the first 10 days of Run II (after the December
upgrade). Only events of code 1 or 2
were considered (those that passed both triggers). The third plot gives the total number of events and the
number of events with this correlation above 1600 each day.
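The daily counts in the third plot can be sketched as below, in Python for illustration. This is not plotCorrelatedEventsPerDay.m; the (day, correlation) input layout and the names are assumptions. Only the cut at 1600 comes from this entry.

```python
from collections import Counter

CORRELATION_CUT = 1600  # separates the two peaks, as described above


def daily_counts(events, cut=CORRELATION_CUT):
    """Count all events and above-cut events for each day.

    events: iterable of (day, correlation) pairs, where correlation is
    calculated without first removing cross-talk (assumed layout).
    """
    total = Counter()
    correlated = Counter()
    for day, correlation in events:
        total[day] += 1
        if correlation > cut:
            correlated[day] += 1
    return total, correlated
```

Days with no events above the cut simply have no entry in the second counter, which reads back as zero.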
10/16/02
Nikolai parameters for spike events
findPointDisplacements.m
fnSpike.m
plotFn.m
The plots below give Nikolai’s parameters for spike events (defined to
be those with criterion > 0.6).
There is a distinct difference between Run I (before December 2001) and
Run II (after December 2001). Plotting
one distribution for each month confirms that there is a change at December
2001. I believe I might have changed
the ADC binning in December, which would have changed the distribution of the
spike criterion and therefore changed how we determine spikes.
10/16/02
Nikolai parameters for correlated
events
fnCorrelated.m
plotFn.m
The plot below gives Nikolai’s
parameters for correlated events (events of code 2 – those that were rejected
by level 2 trigger online) during Dec 2001 and Jan 2002. There was a small bug in calculating the
correlation online that has since been fixed.
The distribution may be more strongly localized with the bug fixed.
10/16/02
Distribution of peak pressure
plotPmaxDistribution.m
Each of the three plots below gives
the distribution of peak pressures of events for one phone (the dominant one,
4) for one day. There appears to be one
peak corresponding to events near the Gaussian noise and one for events above
the Gaussian noise (?). Many days have
two secondary peaks, as in the second plot.
The value of the pressure is only good to within a factor of 2 due to
uncertainty in the phone gains. For
comparison, Nikolai’s paper predicts peak pressures of 0.5 Pa and 0.1 Pa,
respectively, for hadronic and electromagnetic showers at ideal angles 1 km
from the shower.
10/16/02
Nikolai’s parameters for coincidence
candidates
fnCoincidence.m
plotFn.m
The first plot below gives Nikolai’s
parameters for all events from ~ Dec 22 2001 – Jan 22 2002. The second plot gives parameters for
single-phone coincidence candidates during the same period. Here a coincidence candidate is a
single-phone event that is part of at least one quadruplet that passes
coincidence windowing and has one or two hyperboloid intersections in the
fiducial cylinder. Note that enforcing
coincidence reduces the single-phone event count by a factor of 500. So if we have 25 million events total, we
expect only 50 000 coincidence events, or 200 events per day on average. However, there is still a difficult combinatorics
problem: each coincidence candidate is a member of perhaps 10 different
quadruplets. So the rate of quadruplet
coincidence candidates is only a factor of ~60 less than the single-phone rate.