IE6400 Foundations of Data Analytics Engineering
¶
Fall 2023
¶
Module 4: Time Series Analysis Part - 2
¶
Feature Extraction in Time Series Analysis
¶
We now turn to feature extraction techniques for time series analysis. Building on the basic methods covered previously, we transform time series data into alternative representations, including recurrence plots and networks such as the Recurrence Network, the Natural Visibility Network, and the Horizontal Visibility Network.
This session focuses on using these transformations to uncover hidden patterns and structures in time series data that are not immediately apparent in its raw sequential form. We'll explore recurrence quantification analysis (RQA) and various network measures as essential tools: RQA allows us to quantify patterns in a time series statistically, while network measures help us understand the structural properties of a time series when viewed as a network. These methods are not just analytical tools but powerful feature extraction techniques that provide rich descriptors of the underlying time series data.
Application in Machine Learning
¶
The core of our study will be understanding how to harness these extracted features for machine learning applications. Whether it's classification, regression, or clustering tasks, these descriptive features can significantly enhance the capabilities of machine learning models in analyzing time series data. This combination of time series analysis and machine learning opens new avenues for deeper insights and potentially more accurate predictions. So, let's dive in and discover how we can innovatively apply these techniques to our time series data for groundbreaking results.
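As a concrete, hypothetical illustration of this idea, the sketch below assembles a few graph-level descriptors into a feature vector and feeds them to a scikit-learn classifier. The graph_features helper, the random toy graphs, and the choice of RandomForestClassifier are illustrative assumptions rather than lecture material; in practice the graphs would come from the visibility-graph and recurrence-network transformations covered later in this module.
import numpy as np
import networkx as nx
from sklearn.ensemble import RandomForestClassifier

def graph_features(g):
    """Summarize a (visibility or recurrence) graph as a small numeric feature vector."""
    degrees = [d for _, d in g.degree()]
    return [
        g.number_of_nodes(),
        g.number_of_edges(),
        float(np.mean(degrees)),    # average degree
        nx.average_clustering(g),   # clustering coefficient
        nx.density(g),              # edge density
    ]

# Toy example: two classes of random graphs standing in for graphs built from time series
X = [graph_features(nx.gnp_random_graph(30, p)) for p in [0.1] * 20 + [0.4] * 20]
y = [0] * 20 + [1] * 20

clf = RandomForestClassifier(random_state=0).fit(X, y)
print("Training accuracy:", clf.score(X, y))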
Network Science
¶
Network science is an interdisciplinary field that studies complex networks, which are systems of interconnected elements. These networks can be found in various domains,
from biological systems to social interactions, technological infrastructures, and more. The primary goal of network science is to understand the structure, dynamics, function, and evolution of networks.
Key Points
¶
1. Complex Networks: Unlike regular networks (like a lattice or a ring), complex networks have non-trivial topological features, such as a scale-free degree distribution, high clustering, and small-world properties.
2. Nodes and Edges: In the language of network science, individual entities are referred to as "nodes" (or vertices), and the connections between them are called "edges" (or links).
3. Metrics and Measures: Network science employs various metrics to understand networks, such as:
   • Degree: The number of connections a node has.
   • Path Length: The shortest distance between two nodes.
   • Clustering Coefficient: Measures the degree to which nodes cluster together.
   • Centrality: Identifies the most important nodes in a network.
   • Modularity: Measures the strength of division of a network into modules or communities.
4. Types of Networks:
   • Scale-Free Networks: Networks where some nodes have many more connections than others, following a power-law distribution.
   • Small-World Networks: Networks characterized by short path lengths between nodes and high clustering.
   • Random Networks: Networks where connections between nodes are made randomly.
5. Applications: Network science has applications in various fields:
   • Biology: Studying protein-protein interaction networks, neural networks, and ecological networks.
   • Sociology: Analyzing social networks to understand patterns of human interactions.
   • Technology: Understanding the internet's structure, power grids, and transportation networks.
   • Economics: Analyzing trade networks, financial networks, etc.
6. Dynamics and Processes: Beyond static properties, network science also studies dynamic processes on networks, such as diffusion, spreading, synchronization, and cascading failures.
7. Interdisciplinary Nature: Network science draws on theories and methods from physics, mathematics, biology, social science, computer science, and other disciplines.
8. Tools and Software: Various software tools, like Gephi, NetworkX, and Cytoscape, have been developed to visualize and analyze networks.
In essence, network science provides a framework to analyze and understand the intricate web of connections in various systems, revealing insights about their structure, function, and underlying principles.
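As a brief, hedged illustration of the measures listed under point 3, the sketch below computes each of them with NetworkX (the library introduced in the next section) on its built-in karate club graph; the particular nodes and functions chosen here are just one reasonable way to demonstrate each measure.
import networkx as nx
from networkx.algorithms import community

G = nx.karate_club_graph()

print("Degree of node 0:", G.degree[0])                              # Degree
print("Path length 0 -> 33:", nx.shortest_path_length(G, 0, 33))     # Path Length
print("Average clustering coefficient:", nx.average_clustering(G))  # Clustering Coefficient

dc = nx.degree_centrality(G)                                         # Centrality
print("Top node by degree centrality:", max(dc, key=dc.get))

parts = community.greedy_modularity_communities(G)                   # Modularity of a community partition
print("Modularity:", community.modularity(G, parts))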
NetworkX Library
¶
NetworkX is a Python package designed for the creation, manipulation, and study of complex networks of nodes and edges. It provides tools to work with both large and small datasets, and its primary goal is to enable research in the field of network science.
Key Features
¶
1. Data Structures: NetworkX provides data structures for representing various types of networks, including:
   • Undirected networks
   • Directed networks
   • Multi-graphs (networks with multiple edges between nodes)
   • Hypergraphs
2. Network Analysis: The library offers a wide range of algorithms for:
   • Shortest path computations
   • Network traversal
   • Centrality measures
   • Clustering and community detection
   • Network flow problems
3. Visualization: While NetworkX is not primarily a graph drawing tool, it provides basic visualization capabilities using Matplotlib. For more advanced visualization, it can integrate with tools like Graphviz.
4. Flexibility: Nodes can be any hashable object (e.g., text, images, XML records), and edges can contain arbitrary data (see the short sketch after this list).
5. Interoperability: NetworkX can read and write various graph formats, allowing for easy data exchange with other graph libraries or software.
6. Extensibility: The library is designed to be easily extensible, allowing users to implement custom graph algorithms, drawing tools, and more.
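A short sketch of points 4-6 follows: nodes as arbitrary hashable objects, edges carrying data, and a simple round trip through an edge-list file (the file name example.edgelist is illustrative).
import networkx as nx

G = nx.Graph()
G.add_node(("sensor", 1))                   # nodes can be any hashable object, e.g. a tuple
G.add_edge("Paris", "London", weight=2.5)   # edges can carry arbitrary data

nx.write_edgelist(G, "example.edgelist")    # write a simple text-based graph format
H = nx.read_edgelist("example.edgelist")    # read it back into a new graph
print(list(H.edges(data=True)))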
Installing NetworkX Library
¶
To install NetworkX, follow these steps:
¶
Step 1: Ensure You Have Python and pip Installed
Before installing NetworkX, you should have Python and pip (Python package installer) installed on your system.
Step 2: Install NetworkX
Once you have Python and pip ready, you can install NetworkX using pip.
Step 3: Verify the Installation
After the installation is complete, you can verify that NetworkX has been installed correctly by importing it in a Python environment.
In [1]:
# Step 2: Install NetworkX using pip
#!pip install networkx
In [2]:
# Step 3: Verify the installation
import networkx as nx
print("Network X version: ", nx.__version__)
Network X version: 2.5
Exercise 1 Building a Network Graph using NetworkX
¶
Problem Statement:
¶
Using the Zachary's Karate Club dataset, a well-known social network of friendships between 34 members of a karate club at a US university in the 1970s, create a network graph to visualize the relationships. Identify the most influential members of the club based on degree centrality.
Steps:
¶
1. Import Necessary Libraries: Start by importing the required Python libraries.
2. Load the Dataset: NetworkX provides the Karate Club dataset, so you can easily load it.
3. Visualize the Network: Use NetworkX and Matplotlib to visualize the network graph.
4. Calculate Degree Centrality: Identify the most influential members based on degree centrality.
In [3]:
# Step 1: Import Necessary Libraries
import networkx as nx
import matplotlib.pyplot as plt
# Step 2: Load the Dataset
G = nx.karate_club_graph()
# Step 3: Visualize the Network
plt.figure(figsize=(10, 8))
nx.draw(G, with_labels=True, node_color='skyblue', node_size=1500, edge_color='gray')
plt.title("Zachary's Karate Club Network")
plt.show()
# Step 4: Calculate Degree Centrality
degree_centrality = nx.degree_centrality(G)
sorted_degree_centrality = sorted(degree_centrality.items(), key=lambda x: x[1], reverse=True)
print("Top 5 nodes by degree centrality:")
for node, centrality in sorted_degree_centrality[:5]:
    print(f"Node {node}: {centrality:.2f}")
Top 5 nodes by degree centrality:
Node 33: 0.52
Node 0: 0.48
Node 32: 0.36
Node 2: 0.30
Node 1: 0.27
Explanation:
¶
Visualization:
¶
The network graph displays the 34 members of the karate club and their relationships.
Each node represents a member, and each edge represents a relationship between two members.
Interpretation:
¶
The nodes with the highest degree centrality are the most influential members in the club, meaning they have the most direct connections to other members. In the context
of the Karate Club, these members are likely to play a central role in the social dynamics of the club.
Exercise 2 Building and Modifying a Network Graph using NetworkX
¶
Problem Statement:
¶
Using the Florentine Families dataset, which represents the relationships (marriages and business ties) between 15th-century Florentine families, create a network graph to
visualize these relationships. After visualizing the initial dataset, add a new family node and establish connections with existing families.
Steps:
¶
1. Import Necessary Libraries: Begin by importing the required Python libraries.
2. Load the Dataset: NetworkX provides the Florentine Families dataset, making it easy to load.
3. Visualize the Initial Network: Use NetworkX and Matplotlib to visualize the network graph.
4. Add a New Node and Edges: Introduce a new family to the dataset and establish connections with two existing families.
5. Visualize the Updated Network: Display the network graph after adding the new family.
In [4]:
# Step 1: Import Necessary Libraries
import networkx as nx
import matplotlib.pyplot as plt
# Step 2: Load the Dataset
G = nx.florentine_families_graph()
# Step 3: Visualize the Initial Network
plt.figure(figsize=(10, 8))
nx.draw(G, with_labels=True, node_color='lightgreen', node_size=1500, edge_color='gray')
plt.title("Florentine Families Network (Initial)")
plt.show()
In [5]:
# Step 4: Add a New Node and Edges
G.add_node("NewFamily")
G.add_edge("NewFamily", "Medici")
G.add_edge("NewFamily", "Strozzi")
# Step 5: Visualize the Updated Network
plt.figure(figsize=(10, 8))
nx.draw(G, with_labels=True, node_color='lightgreen', node_size=1500,
edge_color='gray')
plt.title("Florentine Families Network (Updated)")
plt.show()
Explanation:
¶
• Visualization (Initial): The initial network graph displays the relationships between the Florentine families. Each node represents a family, and each edge represents a relationship (either through marriage or business ties).
• Visualization (Updated): After adding the "NewFamily" node and its connections, the updated graph showcases the new family and its ties to the "Medici" and "Strozzi" families.
• Interpretation: By introducing a new node and establishing connections, we can observe how the new family integrates into the existing social structure. In this scenario, the "NewFamily" has established ties with two influential families, suggesting a strategic alliance or partnership.
Exercise 3 Understanding Network Measures using NetworkX
¶
Problem Statement:
¶
Using the Les Misérables dataset, which represents the coappearance network of characters in Victor Hugo's novel "Les Misérables", calculate and interpret various network measures to understand the structure and importance of characters in the novel.
Step 1: Import Necessary Libraries
¶
Before diving into the analysis, we need to import the required Python libraries.
In [6]:
# Step 1: Import Necessary Libraries
import networkx as nx
import matplotlib.pyplot as plt
Step 2: Load the Dataset
¶
NetworkX provides the Les Misérables dataset, which we can load directly. This dataset is a weighted graph where nodes represent characters, and edges represent the number of coappearances in the novel.
In [7]:
# Step 2: Load the Dataset
G = nx.les_miserables_graph()
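Since the edges of this graph are weighted by coappearance counts, a quick optional check is to inspect a few edge attributes; the specific pair queried below is just an example.
# Peek at a few weighted edges (coappearance counts)
for u, v, attrs in list(G.edges(data=True))[:3]:
    print(u, "--", v, attrs)
print(G.get_edge_data("Valjean", "Javert"))  # attributes for one well-known pair (None if no such edge)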
Step 3: Visualize the Network
¶
Visualizing the network will give us a graphical representation of the characters and their relationships.
In [8]:
# Step 3: Visualize the Network
plt.figure(figsize=(12, 10))
pos = nx.spring_layout(G)
nx.draw(G, pos, with_labels=True, node_color='skyblue', node_size=1500, edge_color='gray')
plt.title("Les Misérables Character Network")
plt.show()
Step 4: Calculate Network Measures
¶
We'll calculate the following network measures:
• Degree Centrality: This measure identifies the most connected characters.
• Betweenness Centrality: This measure identifies characters that act as bridges in the network.
• Closeness Centrality: This measure identifies characters that are central in terms of information flow.
In [9]:
# Step 4: Calculate Network Measures
# Degree Centrality
degree_centrality = nx.degree_centrality(G)
print("Degree Centrality:")
for node, centrality in degree_centrality.items():
    print(f"Node {node}: {centrality}")
# Betweenness Centrality
betweenness_centrality = nx.betweenness_centrality(G)
print("\nBetweenness Centrality:")
for node, centrality in betweenness_centrality.items():
    print(f"Node {node}: {centrality}")
# Closeness Centrality
closeness_centrality = nx.closeness_centrality(G)
print("\nCloseness Centrality:")
for node, centrality in closeness_centrality.items():
    print(f"Node {node}: {centrality}")
Degree Centrality:
Node Napoleon: 0.013157894736842105
Node Myriel: 0.13157894736842105
Node MlleBaptistine: 0.039473684210526314
Node MmeMagloire: 0.039473684210526314
Node CountessDeLo: 0.013157894736842105
Node Geborand: 0.013157894736842105
Node Champtercier: 0.013157894736842105
Node Cravatte: 0.013157894736842105
Node Count: 0.013157894736842105
Node OldMan: 0.013157894736842105
Node Valjean: 0.47368421052631576
Node Labarre: 0.013157894736842105
Node Marguerite: 0.02631578947368421
Node MmeDeR: 0.013157894736842105
Node Isabeau: 0.013157894736842105
Node Gervais: 0.013157894736842105
Node Listolier: 0.09210526315789473
Node Tholomyes: 0.11842105263157894
Node Fameuil: 0.09210526315789473
Node Blacheville: 0.09210526315789473
Node Favourite: 0.09210526315789473
Node Dahlia: 0.09210526315789473
Node Zephine: 0.09210526315789473
Node Fantine: 0.19736842105263158
Node MmeThenardier: 0.14473684210526316
Node Thenardier: 0.21052631578947367
Node Cosette: 0.14473684210526316
Node Javert: 0.22368421052631576
Node Fauchelevent: 0.05263157894736842
Node Bamatabois: 0.10526315789473684
Node Perpetue: 0.02631578947368421
Node Simplice: 0.05263157894736842
Node Scaufflaire: 0.013157894736842105
Node Woman1: 0.02631578947368421
Node Judge: 0.07894736842105263
Node Champmathieu: 0.07894736842105263
Node Brevet: 0.07894736842105263
Node Chenildieu: 0.07894736842105263
Node Cochepaille: 0.07894736842105263
Node Pontmercy: 0.039473684210526314
Node Boulatruelle: 0.013157894736842105
Node Eponine: 0.14473684210526316
Node Anzelma: 0.039473684210526314
Node Woman2: 0.039473684210526314
Node MotherInnocent: 0.02631578947368421
Node Gribier: 0.013157894736842105
Node MmeBurgon: 0.02631578947368421
Node Jondrette: 0.013157894736842105
Node Gavroche: 0.2894736842105263
Node Gillenormand: 0.09210526315789473
Node Magnon: 0.02631578947368421
Node MlleGillenormand: 0.09210526315789473
Node MmePontmercy: 0.02631578947368421
Node MlleVaubois: 0.013157894736842105
Node LtGillenormand: 0.05263157894736842
Node Marius: 0.25
Node BaronessT: 0.02631578947368421
Node Mabeuf: 0.14473684210526316
Node Enjolras: 0.19736842105263158
Node Combeferre: 0.14473684210526316
Node Prouvaire: 0.11842105263157894
Node Feuilly: 0.14473684210526316
Node Courfeyrac: 0.17105263157894735
Node Bahorel: 0.15789473684210525
Node Bossuet: 0.17105263157894735
Node Joly: 0.15789473684210525
Node Grantaire: 0.13157894736842105
Node MotherPlutarch: 0.013157894736842105
Node Gueulemer: 0.13157894736842105
Node Babet: 0.13157894736842105
Node Claquesous: 0.13157894736842105
Node Montparnasse: 0.11842105263157894
Node Toussaint: 0.039473684210526314
Node Child1: 0.02631578947368421
Node Child2: 0.02631578947368421
Node Brujon: 0.09210526315789473
Node MmeHucheloup: 0.09210526315789473
Betweenness Centrality:
Node Napoleon: 0.0
Node Myriel: 0.17684210526315788
Node MlleBaptistine: 0.0
Node MmeMagloire: 0.0
Node CountessDeLo: 0.0
Node Geborand: 0.0
Node Champtercier: 0.0
Node Cravatte: 0.0
Node Count: 0.0
Node OldMan: 0.0
Node Valjean: 0.5699890527836184
Node Labarre: 0.0
Node Marguerite: 0.0
Node MmeDeR: 0.0
Node Isabeau: 0.0
Node Gervais: 0.0
Node Listolier: 0.0
Node Tholomyes: 0.04062934817733579
Node Fameuil: 0.0
Node Blacheville: 0.0
Node Favourite: 0.0
Node Dahlia: 0.0
Node Zephine: 0.0
Node Fantine: 0.12964454098819422
Node MmeThenardier: 0.02900241873046176
Node Thenardier: 0.07490122123424225
Node Cosette: 0.023796253454148188
Node Javert: 0.05433155966478436
Node Fauchelevent: 0.026491228070175437
Node Bamatabois: 0.008040935672514621
Node Perpetue: 0.0
Node Simplice: 0.008640295033483888
Node Scaufflaire: 0.0
Node Woman1: 0.0
Node Judge: 0.0
Node Champmathieu: 0.0
Node Brevet: 0.0
Node Chenildieu: 0.0
Node Cochepaille: 0.0
Node Pontmercy: 0.006925438596491228
Node Boulatruelle: 0.0
Node Eponine: 0.011487550654163002
Node Anzelma: 0.0
Node Woman2: 0.0
Node MotherInnocent: 0.0
Node Gribier: 0.0
Node MmeBurgon: 0.02631578947368421
Node Jondrette: 0.0
Node Gavroche: 0.16511250242584766
Node Gillenormand: 0.02021062158319776
Node Magnon: 0.00021720969089390142
Node MlleGillenormand: 0.047598927875243675
Node MmePontmercy: 0.0003508771929824561
Node MlleVaubois: 0.0
Node LtGillenormand: 0.0
Node Marius: 0.132032488621946
Node BaronessT: 0.0
Node Mabeuf: 0.027661236424394314
Node Enjolras: 0.0425533568221771
Node Combeferre: 0.0012501455659350393
Node Prouvaire: 0.0
Node Feuilly: 0.0012501455659350393
Node Courfeyrac: 0.00526702988198833
Node Bahorel: 0.0021854883087570067
Node Bossuet: 0.03075365017995782
Node Joly: 0.0021854883087570067
Node Grantaire: 0.00015037593984962405
Node MotherPlutarch: 0.0
Node Gueulemer: 0.004960383978389518
Node Babet: 0.004960383978389518
Node Claquesous: 0.00486180419559921
Node Montparnasse: 0.0038738298738298727
Node Toussaint: 0.0
Node Child1: 0.0
Node Child2: 0.0
Node Brujon: 0.00043859649122807013
Node MmeHucheloup: 0.0
Closeness Centrality:
Node Napoleon: 0.30158730158730157
Node Myriel: 0.4293785310734463
Node MlleBaptistine: 0.41304347826086957
Node MmeMagloire: 0.41304347826086957
Node CountessDeLo: 0.30158730158730157
Node Geborand: 0.30158730158730157
Node Champtercier: 0.30158730158730157
Node Cravatte: 0.30158730158730157
Node Count: 0.30158730158730157
Node OldMan: 0.30158730158730157
Node Valjean: 0.6440677966101694
Node Labarre: 0.39378238341968913
Node Marguerite: 0.41304347826086957
Node MmeDeR: 0.39378238341968913
Node Isabeau: 0.39378238341968913
Node Gervais: 0.39378238341968913
Node Listolier: 0.34080717488789236
Node Tholomyes: 0.3917525773195876
Node Fameuil: 0.34080717488789236
Node Blacheville: 0.34080717488789236
Node Favourite: 0.34080717488789236
Node Dahlia: 0.34080717488789236
Node Zephine: 0.34080717488789236
Node Fantine: 0.46060606060606063
Node MmeThenardier: 0.46060606060606063
Node Thenardier: 0.5170068027210885
Node Cosette: 0.4779874213836478
Node Javert: 0.5170068027210885
Node Fauchelevent: 0.4021164021164021
Node Bamatabois: 0.42696629213483145
Node Perpetue: 0.3179916317991632
Node Simplice: 0.4175824175824176
Node Scaufflaire: 0.39378238341968913
Node Woman1: 0.3958333333333333
Node Judge: 0.40425531914893614
Node Champmathieu: 0.40425531914893614
Node Brevet: 0.40425531914893614
Node Chenildieu: 0.40425531914893614
Node Cochepaille: 0.40425531914893614
Node Pontmercy: 0.37254901960784315
Node Boulatruelle: 0.34234234234234234
Node Eponine: 0.3958333333333333
Node Anzelma: 0.35185185185185186
Node Woman2: 0.4021164021164021
Node MotherInnocent: 0.39790575916230364
Node Gribier: 0.2878787878787879
Node MmeBurgon: 0.3438914027149321
Node Jondrette: 0.25675675675675674
Node Gavroche: 0.5135135135135135
Node Gillenormand: 0.4418604651162791
Node Magnon: 0.33480176211453744
Node MlleGillenormand: 0.4418604651162791
Node MmePontmercy: 0.3153526970954357
Node MlleVaubois: 0.3076923076923077
Node LtGillenormand: 0.36538461538461536
Node Marius: 0.5314685314685315
Node BaronessT: 0.35185185185185186
Node Mabeuf: 0.3958333333333333
Node Enjolras: 0.4810126582278481
Node Combeferre: 0.3917525773195876
Node Prouvaire: 0.3568075117370892
Node Feuilly: 0.3917525773195876
Node Courfeyrac: 0.4
Node Bahorel: 0.39378238341968913
Node Bossuet: 0.475
Node Joly: 0.39378238341968913
Node Grantaire: 0.3584905660377358
Node MotherPlutarch: 0.2846441947565543
Node Gueulemer: 0.4634146341463415
Node Babet: 0.4634146341463415
Node Claquesous: 0.4523809523809524
Node Montparnasse: 0.4578313253012048
Node Toussaint: 0.4021164021164021
Node Child1: 0.34234234234234234
Node Child2: 0.34234234234234234
Node Brujon: 0.38
Node MmeHucheloup: 0.35348837209302325
Step 5: Interpretation of Results
¶
• Degree Centrality: Characters with high degree centrality are the most connected, implying they interact with many other characters in the novel.
• Betweenness Centrality: Characters with high betweenness centrality act as bridges or intermediaries between other characters, suggesting they play a crucial role in the storyline.
• Closeness Centrality: Characters with high closeness centrality can quickly interact with all other characters, indicating their central role in the narrative.
Let's identify the top 5 characters for each measure.
In [10]:
# Step 5: Interpretation of Results
# Top 5 characters by Degree Centrality
sorted_degree = sorted(degree_centrality.items(), key=lambda x: x[1], reverse=True)
print("Top 5 characters by Degree Centrality:")
for char, value in sorted_degree[:5]:
    print(f"{char}: {value:.2f}")
# Top 5 characters by Betweenness Centrality
sorted_betweenness = sorted(betweenness_centrality.items(), key=lambda x: x[1], reverse=True)
print("\nTop 5 characters by Betweenness Centrality:")
for char, value in sorted_betweenness[:5]:
    print(f"{char}: {value:.2f}")
# Top 5 characters by Closeness Centrality
sorted_closeness = sorted(closeness_centrality.items(), key=lambda x: x[1], reverse=True)
print("\nTop 5 characters by Closeness Centrality:")
for char, value in sorted_closeness[:5]:
    print(f"{char}: {value:.2f}")
Top 5 characters by Degree Centrality:
Valjean: 0.47
Gavroche: 0.29
Marius: 0.25
Javert: 0.22
Thenardier: 0.21
Top 5 characters by Betweenness Centrality:
Valjean: 0.57
Myriel: 0.18
Gavroche: 0.17
Marius: 0.13
Fantine: 0.13
Top 5 characters by Closeness Centrality:
Valjean: 0.64
Marius: 0.53
Thenardier: 0.52
Javert: 0.52
Gavroche: 0.51
Conclusion:
¶
By analyzing the network measures, we can gain insights into the relationships and importance of characters in "Les Misérables". Characters with high centrality values play significant roles in the narrative, either due to their interactions with many characters or their bridging role in the storyline.
Visibility Graph Network
¶
A visibility graph is a method used to transform time series data into a complex network. The primary objective is to capture the underlying patterns and structures of the time series in the form of a graph, allowing for the application of graph-theoretical methods to analyze the time series.
Concept
¶
The idea behind the visibility graph is to map a time series into a graph where:
• Each data point in the time series becomes a node in the graph.
• Two nodes (or data points) are connected by an edge if, and only if, they can "see" each other.
The criterion for "visibility" between two data points is defined geometrically: if a straight line can be drawn between two data points without intersecting the time series curve at any other point, those two data points are said to be visible to each other, and hence an edge is drawn between them.
Types of Visibility Graphs
¶
There are mainly two types of visibility graphs:
1. Natural Visibility Graph (NVG): Two data points $(n, a_n)$ and $(m, a_m)$ are connected if every intermediate data point $(k, a_k)$ with $n < k < m$ lies below the straight line joining them, i.e.
$a_k < a_n + (a_m - a_n) \frac{k - n}{m - n}$, or equivalently $a_k < a_m + (a_n - a_m) \frac{m - k}{m - n}$
2. Horizontal Visibility Graph (HVG): Two data points $(n, a_n)$ and $(m, a_m)$ are connected if every intermediate data point $(k, a_k)$ with $n < k < m$ satisfies the condition:
$a_k < \min(a_n, a_m)$
A naive implementation of both criteria is sketched below.
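As referenced above, here is a minimal, unoptimized sketch of both criteria written directly from these inequalities (the ts2vg library used later implements them far more efficiently). The function names are our own, and the short series is the one used in Exercises 4 and 6, so the edge lists can be cross-checked against the ts2vg output there.
def natural_visibility_edges(a):
    """Naive natural visibility graph: return the list of connected index pairs."""
    edges = []
    for n in range(len(a)):
        for m in range(n + 1, len(a)):
            # (n, m) are visible if every intermediate point lies strictly below their chord
            if all(a[k] < a[n] + (a[m] - a[n]) * (k - n) / (m - n) for k in range(n + 1, m)):
                edges.append((n, m))
    return edges

def horizontal_visibility_edges(a):
    """Naive horizontal visibility graph: intermediate points must be lower than both endpoints."""
    edges = []
    for n in range(len(a)):
        for m in range(n + 1, len(a)):
            if all(a[k] < min(a[n], a[m]) for k in range(n + 1, m)):
                edges.append((n, m))
    return edges

ts = [1.0, 0.5, 0.3, 0.7, 1.0, 0.5, 0.3, 0.8]  # same short series as in Exercises 4 and 6
print("NVG edges:", natural_visibility_edges(ts))
print("HVG edges:", horizontal_visibility_edges(ts))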
Applications
¶
Visibility graphs have found applications in various domains:
• Physics: To analyze non-linear time series data from physical systems.
• Finance: To study stock market data and understand market dynamics.
• Biology: To analyze sequences and patterns in biological data.
• Climate Science: To study temperature and other climatic time series.
Advantages
¶
• Universality: Visibility algorithms can be applied to any kind of time series data.
• Simplicity: The method is geometrically intuitive and easy to implement.
• Efficiency: Allows the application of graph-theoretical methods to time series analysis.
Conclusion
¶
Visibility graphs provide a novel way to study time series data by transforming it into a
network. This transformation reveals patterns and structures in the data that might not
be immediately evident from the time series alone.
Installing ts2vg Library
¶
The ts2vg library offers high-performance algorithm implementations to build visibility graphs from time series data. Here's how you can install it:
In [11]:
#!pip install ts2vg
igraph is a library for creating and manipulating graphs and analyzing networks.
In [12]:
#!pip install igraph
Exercise 4 Building a Natural Visibility Graph
¶
Problem Statement:
¶
Given a generated time series dataset, your task is to transform this dataset into a Natural Visibility Graph (NVG) using the ts2vg library. Once the NVG is constructed, visualize it using the NetworkX library to understand the underlying patterns in the time series.
Step 1: Import Necessary Libraries
¶
Before diving into the analysis, we need to import the required Python libraries.
In [13]:
# Step 1: Import Necessary Libraries
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
from ts2vg import NaturalVG
Step 2: Generate the Time Series Dataset
¶
For this exercise, we'll use a short example time series.
In [14]:
# Step 2: Generate the Time Series Dataset
ts = [1.0, 0.5, 0.3, 0.7, 1.0, 0.5, 0.3, 0.8]
# Plot the generated time series
plt.figure(figsize=(10, 5))
plt.plot(ts)
plt.title("Generated Time Series (Sine Wave)")
plt.xlabel("Time")
plt.ylabel("Value")
plt.grid(True)
plt.show()
Step 3: Construct the Natural Visibility Graph (NVG)
¶
Using the ts2vg library, we'll transform our time series data into a Natural Visibility Graph.
In [15]:
# Create a NaturalVG graph object and assign it to g
g = NaturalVG()
# Build the visibility graph from our time series data
g.build(ts)
# Convert the graph to an igraph object
ig_g = g.as_igraph()
# Printing ig_g lets us inspect the network's connections
print(ig_g)
IGRAPH UN-- 8 15 --
+ attr: name (v)
+ edges (vertex names):
0 -- 1, 2, 3, 4 3 -- 0, 1, 2, 4 6 -- 4, 5, 7
1 -- 0, 2, 3, 4 4 -- 0, 1, 3, 5, 6, 7 7 -- 4, 5, 6
2 -- 0, 1, 3 5 -- 4, 6, 7
Now we can examine the number of nodes, links, average degree, network diameter, and average path length.
In [16]:
print('Number of Nodes:',ig_g.vcount())
print('Number of Links:',ig_g.ecount())
print('Average Degree:',np.mean(ig_g.degree()))
print('Network Diameter:',ig_g.diameter())
print('Average Path Length:',ig_g.average_path_length())
Number of Nodes: 8
Number of Links: 15
Average Degree: 3.75
Network Diameter: 3
Average Path Length: 1.5714285714285714
Step 4: Visualize the NVG
¶
Now that we have constructed the NVG, we'll visualize it using the NetworkX library.
In [17]:
# Step 4: Visualize the NVG
plt.figure(figsize=(10, 8))
nx_g = g.as_networkx()
nx.draw_kamada_kawai(nx_g)
plt.title("Natural Visibility Graph of the Time Series")
plt.show()
Interpretation of Results
¶
Visualization:
The NVG provides a unique perspective on the time series data. Each node in the graph corresponds to a data point in the time series, and edges are drawn between nodes that "see" each other based on the visibility criterion.
Interpretation:
Peaks and troughs in the time series can be identified as nodes with higher degrees in the NVG. The structure of the NVG can provide insights into the underlying patterns and periodicities in the time series.
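To check this interpretation on the small example above, one can print each node's degree next to its series value; this is a short sketch assuming the ts list and the nx_g graph from the previous cells, with nodes labeled by their 0-based time index as in the igraph printout.
# Degree of each NVG node alongside its time-series value
for i, value in enumerate(ts):
    print(f"t={i}  value={value:.1f}  NVG degree={nx_g.degree[i]}")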
Conclusion:
¶
The Natural Visibility Graph offers a novel way to study time series data by transforming it into a network. This transformation can reveal patterns and structures in the data that might not be immediately evident from the time series alone.
Exercise 5 Building a Natural Visibility Graph from Apple Stock's Closing Price
¶
Problem Statement:
¶
Given the historical closing prices of Apple Stock, your task is to transform this dataset
into a Natural Visibility Graph (NVG) using the ts2vg library. Once the NVG is constructed, visualize it using the NetworkX library to understand the underlying patterns in the stock prices.
Step 1: Import Necessary Libraries
¶
To begin, we need to import the required Python libraries and fetch the dataset.
In [18]:
# Step 1: Import Necessary Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import networkx as nx
from ts2vg import NaturalVG
Step 2: Load the Apple Stock Dataset
¶
For this exercise, we'll use the historical closing prices of Apple stock.
In [19]:
df_a = pd.read_csv('aapl.csv')
# Now, let's plot our dataset to observe it
df_a.plot('Date', 'Close', title='Time Series Line Graph', figsize=(10,8))
plt.title("Apple Stock's Closing Price")
plt.xlabel("Date")
plt.ylabel("Value")
plt.grid(True)
plt.show()
We will look at only a month of data because processing the whole dataset will take some time.
In [20]:
df_a = df_a.head(30)
# Now, let's plot!
df_a.plot('Date', 'Close', title='Time Series Line Graph', figsize=(10,8))
plt.title("Apple Stock's Closing Price")
plt.xlabel("Date")
plt.ylabel("Value")
plt.grid(True)
plt.show()
Step 3: Construct the Natural Visibility Graph (NVG)
¶
With the ts2vg library, we'll transform our closing prices data into a Natural Visibility Graph.
In [21]:
# Step 3: Construct the Natural Visibility Graph (NVG)
# Create a NaturalVG (natural visibility graph) object.
g = NaturalVG()
# Build the graph using data from the 'Close' column of the DataFrame 'df_a'.
g.build(df_a.Close)
# Retrieve the edges of the graph.
edges = g.edges
# Convert the NaturalVG graph to an igraph object.
ig_g = g.as_igraph()
# Print the igraph object, which represents the graph.
print(ig_g)
IGRAPH UN-- 30 102 --
+ attr: name (v)
+ edges (vertex names):
0 -- 1, 4, 5, 7, 8, 10, 11, 12, 13, 14
1 -- 0, 2, 3, 4, 5, 7, 8, 10, 11, 12, 13, 14
2 -- 1, 3, 5, 8, 10, 12, 13, 14
3 -- 1, 2, 4, 5, 10, 12, 13, 14
4 -- 0, 1, 3, 5
5 -- 0, 1, 2, 3, 4, 6, 7, 8, 10, 12, 13, 14
6 -- 5, 7
7 -- 0, 1, 5, 6, 8, 13, 14
8 -- 0, 1, 2, 5, 7, 9, 10, 13, 14
9 -- 8, 10
10 -- 0, 1, 2, 3, 5, 8, 9, 11, 12, 13, 14
11 -- 0, 1, 10, 12, 13, 14
12 -- 0, 1, 2, 3, 5, 10, 11, 13, 14
13 -- 0, 1, 2, 3, 5, 7, 8, 10, 11, 12, 14
14 -- 0, 1, 2, 3, 5, 7, 8, 10, 11, 12, 13, 15, 16, 17, 18, 23, 24, 26, 27, 28
15 -- 14, 16, 17, 18
16 -- 14, 15, 17, 18
17 -- 14, 15, 16, 18
18 -- 14, 15, 16, 17, 19, 20, 21, 23, 27, 28
19 -- 18, 20, 21
20 -- 18, 19, 21
21 -- 18, 19, 20, 22, 23, 28
22 -- 21, 23
23 -- 14, 18, 21, 22, 24, 26, 27, 28
24 -- 14, 23, 25, 26, 27, 28
25 -- 24, 26
26 -- 14, 23, 24, 25, 27, 28
27 -- 14, 18, 23, 24, 26, 28
28 -- 14, 18, 21, 23, 24, 26, 27, 29
29 -- 28
In [22]:
print('Number of Nodes:',ig_g.vcount())
print('Number of Links:',ig_g.ecount())
print('Average Degree:',np.mean(ig_g.degree()))
print('Network Diameter:',ig_g.diameter())
print('Average Path Length:',ig_g.average_path_length())
Number of Nodes: 30
Number of Links: 102
Average Degree: 6.8
Network Diameter: 4
Average Path Length: 2.1241379310344826
In [23]:
# Convert to NetworkX graph for visualization
nx_g = g.as_networkx()
Step 4: Visualize the NVG
¶
Now, let's visualize the NVG using the NetworkX library to get a graphical representation of the stock's closing prices.
In [24]:
# Visualize the NetworkX graph 'nx_g' using the Kamada-Kawai layout.
nx.draw_kamada_kawai(nx_g)
Visualization:
The NVG offers a unique perspective on the stock's closing prices. Each
node in the graph corresponds to a closing price, and edges are drawn between nodes based on the visibility criterion.
Conclusion:
¶
The Natural Visibility Graph is a powerful tool for transforming time series data, like stock prices, into a network. This transformation can reveal patterns and structures in the data that might not be immediately evident from the time series alone.
Horizontal Visibility Graph (HVG) Network
¶
The Horizontal Visibility Graph (HVG) is a method used to transform a time series into a network. It's a specific type of visibility graph that focuses on horizontal visibility between data points.
Concept
¶
The idea behind the HVG is similar to the general visibility graph, but with a specific criterion for visibility:
• Each data point in the time series becomes a node in the graph.
• Two nodes (or data points) are connected by an edge if they can "see" each other horizontally.
The criterion for "horizontal visibility" is defined as follows: two data points $(n, a_n)$ and $(m, a_m)$ are connected if every data point $(k, a_k)$ with $n < k < m$ satisfies the condition:
$a_k < \min(a_n, a_m)$
In simpler terms, if you can draw a horizontal line between two data points without intersecting the time series curve at any other point, then those two data points are
said to be horizontally visible to each other, and hence, an edge is drawn between them.
Conclusion
¶
The Horizontal Visibility Graph provides a unique way to study time series data by transforming it into a network. This transformation can help in revealing hidden patterns, periodicities, or structures in the data that might not be immediately evident from the time series alone.
Exercise 6 Building a Horizontal Visibility Graph
¶
Problem Statement:
¶
Given a generated time series dataset, your task is to transform this dataset into a Horizontal Visibility Graph (HVG) using the ts2vg library. Once the HVG is constructed, visualize it using the NetworkX library to understand the underlying patterns in the time series.
Step 1: Import Necessary Libraries
¶
To begin, we need to import the required Python libraries and set up the environment.
In [25]:
# Step 1: Import Necessary Libraries
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
from ts2vg import HorizontalVG
Step 2: Generate the Time Series Dataset
¶
For this exercise, we'll use a short example time series.
In [26]:
ts = [1.0, 0.5, 0.3, 0.7, 1.0, 0.5, 0.3, 0.8]
# Plot the generated time series
plt.figure(figsize=(10, 5))
plt.plot(ts)
plt.title("Generated Time Series (Sine Wave)")
plt.xlabel("Time")
plt.ylabel("Value")
plt.grid(True)
plt.show()
Step 3: Construct the Horizontal Visibility Graph (HVG)
¶
Using the ts2vg library, we'll transform our time series data into a Horizontal Visibility Graph.
In [27]:
# Create a HorizontalVG (horizontal visibility graph) object.
g = HorizontalVG()
# Build the graph 'g' using the time series data 'ts'.
g.build(ts)
# Convert the HorizontalVG graph to an igraph object, 'ig_g'.
ig_g = g.as_igraph()
# Print the igraph object, which represents the graph.
print(ig_g)
IGRAPH UN-- 8 12 --
+ attr: name (v)
+ edges (vertex names):
0 -- 1, 3, 4 2 -- 1, 3 4 -- 0, 3, 5, 7 6 -- 5, 7
1 -- 0, 2, 3 3 -- 0, 1, 2, 4 5 -- 4, 6, 7 7 -- 4, 5, 6
In [28]:
print('Number of Nodes:',ig_g.vcount())
print('Number of Links:',ig_g.ecount())
print('Average Degree:',np.mean(ig_g.degree()))
print('Network Diameter:',ig_g.diameter())
print('Average Path Length:',ig_g.average_path_length())
Number of Nodes: 8
Number of Links: 12
Average Degree: 3.0
Network Diameter: 4
Average Path Length: 1.9285714285714286
Step 4: Visualize the HVG using NetworkX
¶
Let's first visualize the HVG using the NetworkX library.
In [29]:
# Convert the graph 'g' to a NetworkX graph, creating 'nx_g'.
nx_g = g.as_networkx()
# Visualize the NetworkX graph 'nx_g' using the Kamada-Kawai layout.
nx.draw_kamada_kawai(nx_g)
Conclusion:
¶
The Horizontal Visibility Graph is a powerful tool for transforming time series data into a network. By visualizing the HVG using NetworkX, we can gain a deeper understanding of the patterns and structures present in the time series data.
Exercise 7 Building a Horizontal Visibility Graph from Apple Stock's Closing Price
¶
Problem Statement:
¶
Given the historical closing prices of Apple Stock, your task is to transform this dataset
into a Horizontal Visibility Graph (HVG) using the ts2vg library. Once the HVG is constructed, visualize it using the NetworkX library to understand the underlying patterns in the stock prices.
Step 1: Import Necessary Libraries
¶
To begin, we need to import the required Python libraries and fetch the dataset.
In [30]:
# Step 1: Import Necessary Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import networkx as nx
from ts2vg import HorizontalVG
Step 2: Load the Apple Stock Dataset
¶
For this exercise, we'll use the historical closing prices of Apple stock.
In [31]:
df_a = pd.read_csv('aapl.csv')
# Now, let's plot our dataset to observe it
df_a.plot('Date', 'Close', title='Time Series Line Graph', figsize=(10,8))
plt.title("Apple Stock's Closing Price")
plt.xlabel("Date")
plt.ylabel("Value")
plt.grid(True)
plt.show()
We will look at only a month of data because processing the whole dataset will take
some time.
In [32]:
df_a = df_a.head(30)
# Now, let's plot!
df_a.plot('Date', 'Close', title='Time Series Line Graph', figsize=(10,8))
plt.title("Apple Stock's Closing Price")
plt.xlabel("Date")
plt.ylabel("Value")
plt.grid(True)
plt.show()
Step 3: Construct the Horizontal Visibility Graph (HVG)
¶
With the ts2vg library, we'll transform our closing prices data into a Horizontal Visibility
Graph.
In [33]:
# Create a HorizontalVG (horizontal visibility graph) object.
g = HorizontalVG()
# Build the graph 'g' using data from the 'Close' column of DataFrame 'df_a'.
g.build(df_a.Close)
# Convert the HorizontalVG graph to an igraph object, 'ig_g'.
ig_g = g.as_igraph()
# Print the igraph object, which represents the graph.
print(ig_g)
IGRAPH UN-- 30 51 --
+ attr: name (v)
+ edges (vertex names):
0 -- 1, 10, 11, 12 13 -- 12, 14 26 --
24, 25, 27
1 -- 0, 2, 3, 4, 5, 7, 8, 10 14 -- 13, 15 27 --
24, 26, 28
2 -- 1, 3 15 -- 14, 16, 18 28 --
18, 23, 24, 27, 29
3 -- 1, 2, 4 16 -- 15, 17, 18 29 -- 28
4 -- 1, 3, 5 17 -- 16, 18
5 -- 1, 4, 6, 7 18 -- 15, 16, 17, 19, 21, 23, 28
6 -- 5, 7 19 -- 18, 20, 21
7 -- 1, 5, 6, 8 20 -- 19, 21
8 -- 1, 7, 9, 10 21 -- 18, 19, 20, 22, 23
9 -- 8, 10 22 -- 21, 23
10 -- 0, 1, 8, 9, 11 23 -- 18, 21, 22, 24, 28
11 -- 0, 10, 12 24 -- 23, 25, 26, 27, 28
12 -- 0, 11, 13 25 -- 24, 26
In [34]:
print('Number of Nodes:',ig_g.vcount())
print('Number of Links:',ig_g.ecount())
print('Average Degree:',np.mean(ig_g.degree()))
print('Network Diameter:',ig_g.diameter())
print('Average Path Length:',ig_g.average_path_length())
Number of Nodes: 30
Number of Links: 51
Average Degree: 3.4
Network Diameter: 11
Average Path Length: 4.875862068965517
In [35]:
nx_g = g.as_networkx()
Step 4: Visualize the HVG
¶
Now, let's visualize the HVG using the NetworkX library to get a graphical representation of the stock's closing prices.
In [36]:
nx.draw_kamada_kawai(nx_g)
Recurrence Plot and Recurrence Quantification Analysis (RQA)
¶
Recurrence Plot
¶
A Recurrence Plot (RP) is a graphical representation used to visualize recurrent patterns in time series data. It provides a way to visualize the behavior of dynamical systems and identify patterns, periodicities, and structures in the data.
Basics:
¶
• The recurrence plot is a two-dimensional square plot of size $N \times N$, where $N$ is the number of data points in the time series.
• Each point in the plot corresponds to a pair of time points in the time series.
• A point $(i, j)$ is colored (or marked) if the state of the system at time $i$ is close to its state at time $j$, based on a predefined threshold.
Applications:
¶
Recurrence plots are used in various domains, including:
• Physics: To study the dynamics of physical systems.
• Biology: To analyze biological signals like EEG or ECG.
• Economics: To study financial time series.
• Climate Science: To analyze climatic data and identify patterns.
Recurrence Quantification Analysis (RQA)
¶
Recurrence Quantification Analysis (RQA) is a method used to quantify the structures and patterns observed in a recurrence plot. It provides numerical measures that describe the complexity and determinism of the time series.
Key Measures:
¶
1. Recurrence Rate (RR): The proportion of recurrent points in the recurrence plot.
2. Determinism (DET): The proportion of recurrent points that form diagonal lines. It indicates the predictability of the system.
3. Laminarity (LAM): The proportion of recurrent points that form vertical or horizontal lines. It indicates the presence of laminar states or plateaus.
4. Trapping Time (TT): The average length of the vertical or horizontal lines. It indicates the average time the system stays in a state.
5. Divergence (DIV): The inverse of the longest diagonal line. It indicates the average time for the system to diverge from a state.
Applications:
¶
RQA is used to:
• Quantify the complexity of time series data.
• Compare different time series or segments of a time series.
• Identify transitions or changes in the dynamics of a system.
Conclusion:
¶
Both the Recurrence Plot and Recurrence Quantification Analysis provide powerful tools for the analysis of time series data. While the Recurrence Plot offers a visual representation of the recurrent patterns in the data, RQA provides numerical measures
that quantify these patterns, offering deeper insights into the dynamics of the system.
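Before turning to the libraries used in the exercises below, here is a minimal NumPy sketch of the two core ideas: a thresholded recurrence matrix and the recurrence rate (RR). The signal and threshold are illustrative only; pyts and pyrqa, used next, provide full implementations of the plot and the remaining RQA measures.
import numpy as np

x = np.sin(np.linspace(0, 4 * np.pi, 200))            # example signal
eps = 0.1                                             # recurrence threshold (illustrative)

# R[i, j] = 1 when the states at times i and j are within eps of each other
R = (np.abs(x[:, None] - x[None, :]) < eps).astype(int)

rr = R.sum() / R.size                                 # Recurrence Rate: fraction of recurrent points
print("Recurrence rate:", rr)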
Exercise 8 Building a Recurrence Plot and Calculating RQA
¶
Problem Statement:
¶
Given a generated time series dataset, your task is to:
• Construct a Recurrence Plot to visualize recurrent patterns in the data.
• Calculate key Recurrence Quantification Analysis (RQA) measures to quantify the patterns observed in the recurrence plot.
Step 1: Import Necessary Libraries
¶
To begin, we need to import the required Python libraries and set up the environment.
In [37]:
#!conda install -c conda-forge pyts
In [38]:
#!pip install numba
In [39]:
# Step 1: Import Necessary Libraries
import numpy as np
import matplotlib.pyplot as plt
from pyts.image import RecurrencePlot
import warnings
# Set warnings to be ignored
warnings.filterwarnings('ignore')
Step 2: Generate the Time Series Dataset
¶
For this exercise, we'll generate a simple sine wave as our time series data.
In [40]:
# Step 2: Generate the Time Series Dataset
t = np.linspace(0, 10 * np.pi, 1000) # Generate time values
y = np.sin(t) # Generate sine wave values
# Plot the generated time series
plt.figure(figsize=(10, 5))
plt.plot(t, y)
plt.title("Generated Time Series (Sine Wave)")
plt.xlabel("Time")
plt.ylabel("Value")
plt.grid(True)
plt.show()
Step 3: Construct the Recurrence Plot
¶
Using the pyts library, we'll transform our time series data into a recurrence plot.
In [41]:
# Step 3: Construct the Recurrence Plot
rp = RecurrencePlot(threshold='point', percentage=20)
X_rp = rp.fit_transform(y.reshape(1, -1))
# Visualize the Recurrence Plot
plt.figure(figsize=(8, 8))
plt.imshow(X_rp[0], cmap='binary', origin='lower')
plt.title("Recurrence Plot")
plt.xlabel("Time")
plt.ylabel("Time")
plt.colorbar(label="Recurrence")
plt.tight_layout()
plt.show()
Step 4: Calculate RQA Measures
¶
To apply RQA to a time series in Python, you can use the pyrqa library, which is specifically designed for Recurrence Quantification Analysis. First, we will import the necessary modules from the pyrqa library.
In [42]:
#!pip install pyrqa
In [43]:
from pyrqa.settings import Settings
from pyrqa.neighbourhood import FixedRadius
from pyrqa.computation import RQAComputation
from pyrqa.time_series import TimeSeries
from pyrqa.metric import EuclideanMetric
In [44]:
# Step 4: Calculate RQA Measures using pyRQA
# Create a TimeSeries object
time_series = TimeSeries(y, embedding_dimension=1, time_delay=1)
# Define the settings for the RQA computation
settings = Settings(time_series,
neighbourhood=FixedRadius(0.5),
similarity_measure=EuclideanMetric,
theiler_corrector=1)
# Compute the RQA measures
computation = RQAComputation.create(settings, verbose=True)
result = computation.run()
# Print the resulting RQA analysis.
np.float = float  # compatibility alias: np.float was removed in newer NumPy versions
print(result)
[Platform 'Apple']
Vendor: Apple
Version: OpenCL 1.2 (Sep 30 2023 03:47:55)
Profile: FULL_PROFILE
Extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event
[Device 'Intel(R) Core(TM) i7-8569U CPU @ 2.80GHz']
Vendor: Intel
Type: 2
Version: OpenCL 1.2 Profile: FULL_PROFILE
Max Clock Frequency: 2800
Global Mem Size: 17179869184
Address Bits: 64
Max Compute Units: 8
Max Work Group Size: 1024
Max Work Item Dimensions: 3
Max Work Item Sizes: [1024, 1, 1]
Local Mem Size: 32768
Max Mem Alloc Size: 4294967296
Extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_APPLE_fp64_basic_ops cl_APPLE_fixed_alpha_channel_orders cl_APPLE_biased_fixed_point_image_formats cl_APPLE_command_queue_priority
RQA Result:
===========
Minimum diagonal line length (L_min): 2
Minimum vertical line length (V_min): 2
Minimum white vertical line length (W_min): 2
Recurrence rate (RR): 0.383278
Determinism (DET): 0.999880
Average diagonal line length (L): 42.470222
Longest diagonal line length (L_max): 999
Divergence (DIV): 0.001001
Entropy diagonal lines (L_entr): 3.716565
Laminarity (LAM): 0.999943
Trapping time (TT): 54.892008
Longest vertical line length (V_max): 100
Entropy vertical lines (V_entr): 3.834467
Average white vertical line length (W): 84.340536
Longest white vertical line length (W_max): 134
Longest white vertical line length inverse (W_div): 0.007463
Entropy white vertical lines (W_entr): 4.715851
Ratio determinism / recurrence rate (DET/RR): 2.608758
Ratio laminarity / determinism (LAM/DET): 1.000063
Recurrence Plot:
The recurrence plot provides a visual representation of when the time series revisits a certain state. Dark regions in the plot indicate recurrent behavior.
RQA measures:
• Recurrence Rate (RR): Proportion of recurrent points in the recurrence plot.
• Determinism (DET): Proportion of recurrent points that form diagonal lines.
• Average Diagonal Line Length (L): Average length of the diagonal lines.
• Longest Diagonal Line Length (L_max): Length of the longest diagonal line.
• Entropy of Diagonal Line Lengths (L_entr): Entropy of the diagonal line lengths distribution.
• Laminarity (LAM): Proportion of recurrent points forming vertical lines.
• Trapping Time (TT): Average length of the vertical lines.
• Longest Vertical Line Length (V_max): Length of the longest vertical line.
• Entropy of Vertical Line Lengths (V_entr): Entropy of the vertical line lengths distribution.
• Ratio Determinism / Recurrence Rate (DET/RR): Ratio of DET to RR.
• Ratio Laminarity / Determinism (LAM/DET): Ratio of LAM to DET.
Conclusion:
¶
The Recurrence Plot and RQA measures provide powerful tools for analyzing time series data. They offer insights into the recurrent patterns and deterministic structures
in the data, which might not be immediately evident from the time series alone.
Exercise 9 Building a Recurrence Plot and Calculating RQA for AirPassengers Dataset
¶
Problem Statement:
¶
Given the AirPassengers time series dataset, which represents the monthly totals of international airline passengers:
• Visualize the time series data.
• Construct a Recurrence Plot to visualize recurrent patterns in the data.
• Calculate key Recurrence Quantification Analysis (RQA) measures using the pyRQA library.
Step 1: Import Necessary Libraries
¶
To start, we'll import the required Python libraries.
In [45]:
# Step 1: Import Necessary Libraries
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
from pyrqa.settings import Settings
from pyrqa.neighbourhood import FixedRadius
from pyrqa.computation import RQAComputation
from pyrqa.time_series import TimeSeries
from pyrqa.metric import EuclideanMetric
from pyts.image import RecurrencePlot
Step 2: Load and Visualize the AirPassengers Dataset
¶
We'll load the AirPassengers dataset and visualize it to understand its structure and trend.
In [46]:
# Step 2: Load and Visualize the AirPassengers Dataset
data = sm.datasets.get_rdataset('AirPassengers').data
passengers = data['value'].values
plt.figure(figsize=(12, 6))
plt.plot(data['time'], passengers)
plt.title('AirPassengers Time Series')
plt.xlabel('Year')
plt.ylabel('Number of Passengers')
plt.grid(True)
plt.show()
Step 3: Construct the Recurrence Plot
¶
Using the pyts library, we'll transform our time series data into a recurrence plot.
In [47]:
# Step 3: Construct the Recurrence Plot
rp = RecurrencePlot(threshold='point', percentage=20)
X_rp = rp.fit_transform(passengers.reshape(1, -1))
plt.figure(figsize=(8, 8))
plt.imshow(X_rp[0], cmap='binary', origin='lower')
plt.title("Recurrence Plot")
plt.xlabel("Time")
plt.ylabel("Time")
plt.colorbar(label="Recurrence")
plt.tight_layout()
plt.show()
Step 4: Calculate RQA Measures using pyRQA
¶
We'll set up the necessary settings for RQA computation and then calculate the RQA measures.
In [48]:
# Step 4: Calculate RQA Measures using pyRQA
time_series = TimeSeries(passengers, embedding_dimension=1, time_delay=1)
settings = Settings(time_series,
neighbourhood=FixedRadius(0.5),
similarity_measure=EuclideanMetric,
theiler_corrector=1)
computation = RQAComputation.create(settings, verbose=True)
result = computation.run()
print(result)
[Platform 'Apple']
Vendor: Apple
Version: OpenCL 1.2 (Sep 30 2023 03:47:55)
Profile: FULL_PROFILE
Extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event
[Device 'Intel(R) Core(TM) i7-8569U CPU @ 2.80GHz']
Vendor: Intel
Type: 2
Version: OpenCL 1.2 Profile: FULL_PROFILE
Max Clock Frequency: 2800
Global Mem Size: 17179869184
Address Bits: 64
Max Compute Units: 8
Max Work Group Size: 1024
Max Work Item Dimensions: 3
Max Work Item Sizes: [1024, 1, 1]
Local Mem Size: 32768
Max Mem Alloc Size: 4294967296
Extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_APPLE_fp64_basic_ops cl_APPLE_fixed_alpha_channel_orders cl_APPLE_biased_fixed_point_image_formats cl_APPLE_command_queue_priority
RQA Result:
===========
Minimum diagonal line length (L_min): 2
Minimum vertical line length (V_min): 2
Minimum white vertical line length (W_min): 2
Recurrence rate (RR): 0.009549
Determinism (DET): 0.000000
Average diagonal line length (L): nan
Longest diagonal line length (L_max): 1
Divergence (DIV): 1.000000
Entropy diagonal lines (L_entr): 0.000000
Laminarity (LAM): 0.080808
Trapping time (TT): 2.000000
Longest vertical line length (V_max): 2
Entropy vertical lines (V_entr): 0.000000
Average white vertical line length (W): 63.364198
Longest white vertical line length (W_max): 143
Longest white vertical line length inverse (W_div): 0.006993
Entropy white vertical lines (W_entr): 4.818567
Ratio determinism / recurrence rate (DET/RR): 0.000000
Ratio laminarity / determinism (LAM/DET): inf
Recurrence Plot:
The recurrence plot provides a visual representation of when the time series revisits a certain state. Dark regions in the plot indicate recurrent behavior.
RQA Measures:
The output provides various RQA measures that quantify the patterns observed in the recurrence plot. These measures offer insights into the recurrent behavior, determinism, complexity, and other dynamical properties of the time series data.
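As a complement to the library output, the short numpy sketch below shows how two of these measures, the recurrence rate (RR) and determinism (DET), could be computed directly from a binary recurrence matrix such as X_rp[0] from Step 3. It is a minimal illustration, not the pyRQA implementation, and its values will not match the output above exactly because the radius, embedding, and Theiler-window conventions differ.
# Rough numpy sketch (not the pyRQA implementation) of RR and DET,
# computed from the binary recurrence matrix produced in Step 3.
import numpy as np
def recurrence_rate(R):
    """Fraction of matrix entries that are recurrence points."""
    return R.sum() / R.size
def determinism(R, l_min=2):
    """Fraction of off-diagonal recurrence points lying on diagonal
    lines of length >= l_min (the line of identity is excluded)."""
    n = R.shape[0]
    diag_points = 0
    for k in range(-(n - 1), n):
        if k == 0:                      # skip the line of identity
            continue
        run = 0
        for v in np.diag(R, k):         # count runs of consecutive ones on this diagonal
            if v:
                run += 1
            else:
                if run >= l_min:
                    diag_points += run
                run = 0
        if run >= l_min:
            diag_points += run
    off_diag = R.sum() - np.trace(R)
    return diag_points / off_diag if off_diag else 0.0
R = X_rp[0].astype(bool)
print(f"RR  (sketch): {recurrence_rate(R):.4f}")
print(f"DET (sketch): {determinism(R):.4f}")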
Conclusion:
¶
The Recurrence Plot and RQA measures provide powerful tools for analyzing time series data. By applying these tools to the AirPassengers dataset, we can gain insights into the recurrent patterns and deterministic structures in the data, which might not be immediately evident from the time series alone.
Recurrence Network
¶
A Recurrence Network is a complex network representation derived from a recurrence plot, which is a graphical tool used to visualize and analyze recurrent patterns in time series data. The concept of recurrence networks bridges the gap between nonlinear time series analysis and complex network theory.
Basics of Recurrence:
¶
Recurrence is a fundamental property of many dynamical systems, where states that have been visited once are visited again. This property can be visualized using a recurrence plot, where a point (i, j) is plotted if the state of the system at time i is close to its state at time j.
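To make the idea concrete, here is a minimal sketch (an illustration only, using a 1-D series without embedding and an arbitrary threshold eps) of how such a recurrence matrix can be built:
# Minimal sketch: mark (i, j) as recurrent when |x_i - x_j| <= eps
import numpy as np
def recurrence_matrix(x, eps):
    x = np.asarray(x, dtype=float)
    dist = np.abs(x[:, None] - x[None, :])    # pairwise distances between states
    return (dist <= eps).astype(int)          # 1 where states are close, else 0
x = np.sin(np.linspace(0, 4 * np.pi, 200))    # toy periodic series
R = recurrence_matrix(x, eps=0.1)
print(R.shape, int(R.sum()))                  # 200 x 200 matrix and its recurrence count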
Constructing a Recurrence Network:
¶
A recurrence network is constructed from a recurrence plot by treating each time point
in the time series as a node and connecting two nodes if there's a point in the recurrence plot corresponding to those two time points. In other words, two nodes (or time points) are connected if the states of the system at those times are close to each other.
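A minimal sketch of that construction, assuming a binary recurrence matrix R like the one in the previous sketch (or X_rp[0] from one of the exercises below), might look like this:
# Sketch: recurrence matrix -> recurrence network
import numpy as np
import networkx as nx
A = R.copy()                  # binary recurrence matrix from the sketch above
np.fill_diagonal(A, 0)        # drop the line of identity so it does not become self-loops
G = nx.from_numpy_array(A)    # nodes are time indices, edges are recurrences
print(G.number_of_nodes(), "nodes,", G.number_of_edges(), "edges")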
Characteristics and Measures:
¶
Recurrence networks can be characterized using various complex network measures, such as:
1. Clustering Coefficient: Measures the degree to which nodes in a graph tend to cluster together.
2. Average Path Length: Represents the average number of steps along the shortest paths for all possible pairs of network nodes.
3. Degree Distribution: Shows the distribution of the number of connections each node has to other nodes.
4. Transitivity: Measures the probability that the adjacent vertices of a vertex are connected.
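These measures can be computed directly with NetworkX. The sketch below assumes a recurrence network G like the one built in the previous sketch; because a recurrence network may be disconnected, the average path length is computed on the largest connected component only.
# Sketch: characterizing a recurrence network G with NetworkX
import networkx as nx
avg_clustering = nx.average_clustering(G)                 # clustering coefficient
transitivity = nx.transitivity(G)                         # global transitivity
degrees = [d for _, d in G.degree()]                      # degree distribution
giant = G.subgraph(max(nx.connected_components(G), key=len))
avg_path_length = nx.average_shortest_path_length(giant)  # average path length
print(f"Average clustering coefficient: {avg_clustering:.3f}")
print(f"Transitivity: {transitivity:.3f}")
print(f"Average path length (largest component): {avg_path_length:.3f}")
print(f"Degree range: {min(degrees)} to {max(degrees)}")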
Applications:
¶
Recurrence networks have been applied in various domains to analyze time series data, including:
• Climate Science: To study and detect climatic transitions and tipping points.
• Biology: For analyzing patterns in biological data, such as heart rate or EEG signals.
• Physics: To study the dynamics of various physical systems.
• Finance: To analyze stock market data and understand market dynamics.
Advantages:
¶
1. Versatility: Can be applied to any kind of time series data.
2. Insightful: Provides insights into the dynamics of the system generating the time series.
3. Bridging Disciplines: Combines techniques from nonlinear dynamics and network theory, allowing for a richer analysis.
Conclusion:
¶
Recurrence networks provide a powerful tool for studying the dynamics of time series data. By transforming time series data into a network, one can gain insights into the recurrent patterns and underlying structures in the data, which might not be immediately evident from traditional time series analysis methods.
Exercise 10 Building, Visualizing, and Analyzing a Recurrence Network
¶
Problem Statement:
¶
Given a generated time series dataset:
• Construct a Recurrence Plot to visualize recurrent patterns in the data.
• Transform the Recurrence Plot into a Recurrence Network.
• Visualize the Recurrence Network using the NetworkX library.
• Calculate and interpret key network measures.
Step 1: Import Necessary Libraries
¶
To start, we'll import the required Python libraries.
In [49]:
# Step 1: Import Necessary Libraries
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
from pyts.image import RecurrencePlot
Step 2: Generate the Time Series Dataset
¶
For this exercise, we'll generate a simple periodic signal, a sine wave plus a higher-frequency cosine term, as our time series data.
In [50]:
# Step 2: Generate the Time Series Dataset
t = np.linspace(0, 4 * np.pi, 100)
ts = np.sin(t) + np.cos(2*t)
# Plot the generated time series
plt.figure(figsize=(10, 5))
plt.plot(ts)
plt.title("Generated Time Series (Sine Wave)")
plt.xlabel("Time")
plt.ylabel("Value")
plt.grid(True)
plt.show()
Step 3: Construct the Recurrence Plot
¶
We'll use the pyts library to transform our time series data into a recurrence plot.
In [51]:
# Step 3: Construct the Recurrence Plot
rp_model = RecurrencePlot(threshold='point', percentage=20)
X_rp = rp_model.fit_transform(np.array(ts).reshape(1, -1))
plt.figure(figsize=(8, 8))
plt.imshow(X_rp[0], cmap='binary', origin='lower')
plt.title("Recurrence Plot")
plt.xlabel("Time")
plt.ylabel("Time")
plt.colorbar(label="Recurrence")
plt.tight_layout()
plt.show()
Step 4: Transform the Recurrence Plot into a Recurrence Network
¶
We'll convert the binary recurrence plot into a graph using NetworkX.
In [52]:
# Step 4: Transform the Recurrence Plot into a Recurrence Network
G = nx.from_numpy_array(X_rp[0])
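Note that the recurrence plot's main diagonal (every point trivially recurs with itself) becomes a set of self-loops in G. If the self-loops are not wanted for subsequent measures, one optional cleanup step is shown below; keep in mind that applying it would change the values reported in Step 6 slightly.
# Optional: remove the self-loops contributed by the recurrence plot's main diagonal
G.remove_edges_from(nx.selfloop_edges(G))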
Step 5: Visualize the Recurrence Network
¶
We'll visualize the constructed recurrence network using NetworkX.
In [53]:
# Step 5: Visualize the Recurrence Network
plt.figure(figsize=(10, 10))
pos = nx.spring_layout(G)
nx.draw(G, pos, node_size=50, edge_color='gray')
plt.title("Recurrence Network")
plt.show()
Step 6: Calculate Network Measures
¶
We'll compute some basic network measures, such as the average degree, clustering coefficient, and diameter of the network.
In [54]:
# Step 6: Calculate Network Measures
avg_degree = np.mean(list(dict(G.degree()).values()))
clustering_coefficient = nx.average_clustering(G)
diameter = nx.diameter(G)
print(f"Average Degree: {avg_degree:.2f}")
print(f"Average Clustering Coefficient: {clustering_coefficient:.2f}")
print(f"Diameter of the Network: {diameter}")
Average Degree: 21.00
Average Clustering Coefficient: 0.77
Diameter of the Network: 14
Interpretation of Results
¶
Recurrence Network: The visualization provides a network representation of the recurrent patterns in the time series.
Network Measures:
¶
• Average Degree: Indicates the average number of connections each node has.
• Average Clustering Coefficient: Measures the degree to which nodes in the graph tend to cluster together.
• Diameter: Represents the longest shortest path between any two nodes in the network.
Conclusion:
¶
The Recurrence Network and its measures provide a unique perspective on the time series data. By transforming the time series into a network, we can gain insights into its recurrent patterns and underlying structures.
Exercise 11 Building and Analyzing a Recurrence Network from Apple's Stock Closing Price
¶
Problem Statement:
¶
Given the closing price of Apple's stock:
• Construct a Recurrence Plot to visualize recurrent patterns in the stock price.
• Transform the Recurrence Plot into a Recurrence Network.
• Visualize the Recurrence Network using the NetworkX library.
• Calculate and interpret key network measures.
Step 1: Import Necessary Libraries
¶
To begin, we'll import the required Python libraries.
In [55]:
# Step 1: Import Necessary Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import networkx as nx
from pyts.image import RecurrencePlot
Step 2: Load the Apple Stock Closing Price Dataset
¶
For this exercise, we'll load Apple's historical closing prices from a local CSV file (aapl.csv).
In [56]:
df_a = pd.read_csv('aapl.csv')
apple_stock_price = df_a['Close'].values
# Now, let's plot our dataset to observe it
plt.plot(apple_stock_price)
plt.title("Apple Stock's Closing Price")
plt.ylabel("Value")
plt.grid(True)
plt.show()
Step 3: Construct the Recurrence Plot
¶
We'll use the pyts library to transform the stock closing price data into a recurrence plot.
In [57]:
# Step 3: Construct the Recurrence Plot
rp_model = RecurrencePlot(threshold='point', percentage=20)
X_rp = rp_model.fit_transform(np.array(apple_stock_price).reshape(1, -1))
plt.figure(figsize=(8, 8))
plt.imshow(X_rp[0], cmap='binary', origin='lower')
plt.title("Recurrence Plot")
plt.xlabel("Time")
plt.ylabel("Time")
plt.colorbar(label="Recurrence")
plt.tight_layout()
plt.show()
Step 4: Transform the Recurrence Plot into a Recurrence Network
¶
We'll convert the binary recurrence plot into a graph using NetworkX.
In [58]:
# Step 4: Transform the Recurrence Plot into a Recurrence Network
G = nx.from_numpy_array(X_rp[0])
Step 5: Visualize the Recurrence Network
¶
We'll visualize the constructed recurrence network using NetworkX.
In [59]:
# Step 5: Visualize the Recurrence Network
plt.figure(figsize=(10, 10))
pos = nx.spring_layout(G)
nx.draw(G, pos, node_size=50, edge_color='gray')
plt.title("Recurrence Network of Apple's Stock Closing Price")
plt.show()
Step 6: Calculate Network Measures
¶
We'll compute some basic network measures like the average degree, clustering coefficient, and diameter of the network.
In [60]:
# Step 6: Calculate Network Measures
avg_degree = np.mean(list(dict(G.degree()).values()))
clustering_coefficient = nx.average_clustering(G)
diameter = nx.diameter(G)
print(f"Average Degree: {avg_degree:.2f}")
print(f"Average Clustering Coefficient: {clustering_coefficient:.2f}")
print(f"Diameter of the Network: {diameter}")
Average Degree: 252.40
Average Clustering Coefficient: 0.81
Diameter of the Network: 13
Interpretation of Results
¶
Recurrence Network: The visualization provides a network representation of the recurrent patterns in Apple's stock closing price.
Network Measures:
¶
• Average Degree: Indicates the average number of connections each node has.
• Average Clustering Coefficient: Measures the degree to which nodes in the graph tend to cluster together.
• Diameter: Represents the longest shortest path between any two nodes in the network.
Conclusion:
¶
The Recurrence Network and its measures provide a unique perspective on Apple's stock closing price. By transforming the stock price into a network, we can gain insights into its recurrent patterns and underlying structures.
Summary:
¶
In the recent exercises, you have explored advanced techniques for time series analysis that move beyond traditional methods. You've learned to transform time series data into various forms, such as recurrence plots and networks like the Recurrence Network, Natural Visibility Network, and Horizontal Visibility Network. These transformations offer a unique lens through which to view and analyze time series, revealing intricate patterns and structures that might not be immediately evident in the raw sequential data.
Furthermore, the exercises introduced the recurrence quantification analysis (RQA) concept and various network measures. Both RQA and network measures can serve as
powerful feature extraction methods. RQA provides statistical measures that quantify patterns in recurrence plots, while network measures capture the structural properties of time series when represented as networks. These extracted features are rich descriptors of the underlying time series data.
With these features, you can train machine learning models for classification, regression, or clustering tasks. By feeding these descriptive features into models, we can harness the power of machine learning to analyze time series in new ways. This approach offers deeper insights and paves the way for potentially more accurate predictions, showcasing the versatility of combining time series analysis with machine learning.
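As a purely illustrative sketch of that workflow, the code below stacks a few such measures into a feature matrix and fits an off-the-shelf classifier. The feature values and class labels are placeholders rather than results computed in this notebook, and the choice of scikit-learn's RandomForestClassifier is an assumption, not a prescribed model.
# Hypothetical sketch: RQA / network measures as features for classification
# (all feature values and labels below are placeholders, not notebook results)
import numpy as np
from sklearn.ensemble import RandomForestClassifier
# Each row: [recurrence rate, average degree, clustering coefficient, laminarity]
X = np.array([
    [0.010, 21.0, 0.77, 0.08],
    [0.030, 250.0, 0.81, 0.12],
    [0.015, 35.0, 0.65, 0.05],
    [0.025, 180.0, 0.79, 0.10],
])
y = np.array([0, 1, 0, 1])    # placeholder labels for two hypothetical regimes
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)
print(clf.predict(X))         # sanity check on the training data itself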
Revised Date: November 18, 2023
¶
In [ ]: