I am excited to see GCP 2.0 being conducted, and I am very interested in how the measurements and data are collected and how the coherence between the devices is calculated.
I was not able to find any publications on the methodology. Could you maybe reveal a bit more about that, @nplonka?
GCP mostly uses decades-old signal-processing methods.
How about trying transfer entropy, or LZ distance between generators?
Ideally, there would be an API for trying out different data-processing methods.
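For the LZ idea, a cheap stand-in is the normalized compression distance (NCD) computed from zlib-compressed lengths. Here is a minimal Python sketch; how raw generator bytes would actually be obtained from GCP is an assumption, so the loading step is left as a placeholder:

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance, a practical proxy for LZ distance."""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

# Hypothetical usage: compare raw bitstreams from two generators.
# gen_a = open("generator_a.bin", "rb").read()  # placeholder file names
# gen_b = open("generator_b.bin", "rb").read()
# print(ncd(gen_a, gen_b))  # near 0 = very similar, near 1 = unrelated
```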
Also, Ulf Holmberg has shown that GCP data can be correlated with financial market movements and Google Trends, which are much better-defined metrics than “significant events”.
@nplonka Hi, I was wondering if there are any plans for the GCP2.net data to be made public, either in realtime via CSV/API or in historical chunks at regular intervals?
@WanderingIshiki @Simurgh Great ideas. Yes, we are working towards making the data available. We have limited bandwidth, so we greatly appreciate it whenever someone wants to take a look at the data from another angle.
There are several areas of research that are open in GCP 2.0, but the first step would be to familiarize yourself with the kind of data that you are analyzing.
For the global network, here is a sample of the Network Coherence data, shared via Google Drive. All timestamps are in UTC. If you run a cumulative sum over the Network Coherence data and overlay the envelope, you should be able to reproduce plots similar to those shown on our website. To learn more about the event that this data describes and how the plot should look, check out the “Meditation: At the Global Spirituality Mahotsav” event under Event Analysis on our website.
Once you are able to reproduce this, let us know if you are interested in more global network data or something else.
Optional details: The above should be enough for you to get started, but if you have more questions, then read the following.
For the mathematical details of the Network Coherence data, see Bancel and Nelson’s 2008 Journal of Scientific Exploration paper on GCP 1, where this metric was originally known as Network Variance (NetVar).
Instructions on generating the plot from the data: given the Network Coherence, you only need to take a cumulative sum of it over the desired time region to get the red curve. Then overlay the blue envelope, which is the 95% confidence interval of the chi-squared distribution for increasing degrees of freedom, i.e. the degrees of freedom at each point correspond to the number of seconds on the time (x) axis.
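For anyone who wants to jump straight in, here is a minimal Python sketch of that recipe. This is not official GCP code, and the file name and column name are placeholders for whatever the shared sample actually uses:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import chi2

# Placeholder file/column names; adjust to the shared sample's actual schema.
df = pd.read_csv("network_coherence_sample.csv")
nc = df["network_coherence"].to_numpy()

cum = np.cumsum(nc)                # red curve: cumulative Network Coherence
dof = np.arange(1, len(nc) + 1)    # degrees of freedom grow by 1 per second
lo = chi2.ppf(0.025, dof)          # blue envelope: central 95% interval
hi = chi2.ppf(0.975, dof)          # of chi-squared with increasing dof

plt.plot(dof, cum, color="red", label="cumulative Network Coherence")
plt.plot(dof, hi, color="blue", label="95% envelope")
plt.plot(dof, lo, color="blue")
plt.xlabel("seconds since start of event")
plt.legend()
plt.show()
```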
This reminds me, I’ve got all of GCP v1.0’s raw EGG data from 1998 up until late 2023 (I think - it’s easy enough to update it with the scripts) in Google’s BigQuery (their data warehouse and analytics platform). I’ve always wanted to try applying a “rolling cumulative sum” over the whole dataset to look for anomalies holistically and not just for the time periods of chosen events.
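If the data is already in BigQuery, a window function gets most of the way to a rolling cumulative sum before anything is pulled down. A sketch, assuming a per-second table; the project, dataset, table, and column names here are all hypothetical, not the real schema:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Table and column names are hypothetical; substitute the real schema.
query = """
SELECT
  ts,
  SUM(netvar) OVER (
    ORDER BY ts
    ROWS BETWEEN 3599 PRECEDING AND CURRENT ROW  -- 1-hour rolling window
  ) AS rolling_cumsum_1h
FROM `my-project.gcp1.egg_netvar`
ORDER BY ts
"""

for row in client.query(query).result():
    print(row.ts, row.rolling_cumsum_1h)
```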
Also, it would be a good idea to check correlations of GCP data with Downdetector.
Millions of people get annoyed by outages and lag, and there are good numbers on how many, where, and precisely when. Even without direct feedback on exactly how irritated they were, that collective frustration should presumably register with the universe in some measurable way.
@WanderingIshiki Nice! If you are interested in 1.5 years of Network Coherence across a large span of GCP 2.0’s runtime, here it is, shared via Google Drive. I will explain the columns at some point, but the key ones to start with are time_epoch and netvar_count_xor_alt, which is the fully whitened Network Coherence as in your plot above, before taking a cumulative sum.
Also, remove the visually obvious spike outliers from periods when we had technical issues with the data. It is a complex project and we are still ironing out kinks. The following dates should cover most or all of the bad data: 2023-06-01 through 2023-06-09, 2023-09-28, 2023-09-29, 2023-11-27, 2024-02-06, and 2024-02-27 through 2024-03-13.
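A quick pandas sketch for loading the file and dropping those dates; the filename is a placeholder, but the two columns are the ones named above:

```python
import pandas as pd

df = pd.read_csv("gcp2_network_coherence.csv")  # placeholder filename
df["time"] = pd.to_datetime(df["time_epoch"], unit="s", utc=True)

# Dates with known technical issues, per the list above.
bad_dates = set(
    list(pd.date_range("2023-06-01", "2023-06-09").date)
    + list(pd.to_datetime(["2023-09-28", "2023-09-29",
                           "2023-11-27", "2024-02-06"]).date)
    + list(pd.date_range("2024-02-27", "2024-03-13").date)
)
df = df[~df["time"].dt.date.isin(bad_dates)]

# Fully whitened Network Coherence, before taking the cumulative sum.
netvar = df["netvar_count_xor_alt"]
```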
@Simurgh Sounds like a good idea. I think you now have the data here to play with that, although I first recommend trying to reproduce the one event, as @WanderingIshiki did. We have definitely been trying some of those long-term correlations that don’t require us to choose an event.
I’ve poked around Downdetector. Turns out it’s not so simple: there is no publicly available historical data, and there is a business API, which presumably isn’t free.
I suggest someone write to them about acquiring historical data. It’s not impossible that they would cooperate with GCP. I probably shouldn’t be the one to write, because I’m from RU.
Here is a link to the movie, The Joy of Sox (2013) https://www.youtube.com/watch?v=zfm2uMUZcKc. About 29 minutes into the video, they show and describe the system and data analysis I provided for making measurements of the collective influence on my MMI system at a Red Sox home game.
Alternatively, find someone who will let us use their own access to Downdetector historical data for our research.
Or figure out a scraper for the live Downdetector data. That won’t give us historical data, but it would let us spot correlations going forward. It will be tedious, though; maps are hard to scrape.
@nplonka I’m not sure if there’s a proper channel to report data outages, but I just wanted to let you know that device d4d4da4a9493 (498) (in New Zealand) was offline a couple of times around 2024-11-09 and 2024-11-10 while I was moving house.
There are probably a tonne of APIs and historical/realtime time-series data sets we could use. I want to work on a datasource-agnostic comparison tool that can take Down Detector or NASDAQ prices (or any index) and compare them against GCP 1 or 2 in an animated view over a chosen time window.
This has been on my mind for years, but it’ll take time to get there.
I lack a solid stats background, so I’m not sure which statistical methods are good for such comparisons (beyond the few I’ve researched).
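One hedged starting point while figuring out proper methods: align the two series, difference them so a shared trend doesn’t create spurious correlation, then look at Pearson/Spearman plus a rolling correlation. A sketch, assuming the inputs are timestamp-indexed pandas Series; everything else is a placeholder:

```python
import pandas as pd
from scipy.stats import pearsonr, spearmanr

def compare(a: pd.Series, b: pd.Series, window: int = 30) -> dict:
    """Basic comparison of two timestamp-indexed series, e.g. a daily
    GCP deviation measure vs. a daily market index."""
    joined = pd.concat({"a": a, "b": b}, axis=1).dropna()
    # Difference first: two trending series correlate spuriously otherwise.
    diffed = joined.diff().dropna()
    r, p = pearsonr(diffed["a"], diffed["b"])
    rho, p_rho = spearmanr(diffed["a"], diffed["b"])
    rolling = joined["a"].rolling(window).corr(joined["b"])
    return {"pearson": (r, p), "spearman": (rho, p_rho), "rolling": rolling}
```

Cross-correlation at various lags and permutation/surrogate testing would be natural next steps once something like this is in place.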